path: root/python/pyspark/rdd.py
Commit message | Author | Date | Files | Lines
* [SPARK-3886] [PySpark] simplify serializer, use AutoBatchedSerializer by defa... | Davies Liu | 2014-11-03 | 1 | -54/+37
* [SPARK-4148][PySpark] fix seed distribution and add some tests for rdd.sample | Xiangrui Meng | 2014-11-03 | 1 | -3/+0
* [SPARK-4150][PySpark] return self in rdd.setName | Xiangrui Meng | 2014-10-31 | 1 | -2/+2
* [Spark] RDD take() method: overestimate too much | yingjieMiao | 2014-10-13 | 1 | -1/+4
* [SPARK-3909][PySpark][Doc] A corrupted format in Sphinx documents and buildin... | cocoatomo | 2014-10-11 | 1 | -1/+1
* [SPARK-3412] [PySpark] Replace Epydoc with Sphinx to generate Python API docs | Davies Liu | 2014-10-07 | 1 | -26/+26
* [SPARK-3773][PySpark][Doc] Sphinx build warning | cocoatomo | 2014-10-06 | 1 | -0/+1
* [SPARK-3749] [PySpark] fix bugs in broadcast large closure of RDD | Davies Liu | 2014-10-01 | 1 | -3/+9
* [SPARK-3478] [PySpark] Profile the Python tasks | Davies Liu | 2014-09-30 | 1 | -2/+8
* Revert "[SPARK-3478] [PySpark] Profile the Python tasks" | Josh Rosen | 2014-09-26 | 1 | -8/+2
* [SPARK-3478] [PySpark] Profile the Python tasks | Davies Liu | 2014-09-26 | 1 | -2/+8
* [SPARK-546] Add full outer join to RDD and DStream. | Aaron Staple | 2014-09-24 | 1 | -2/+23
* [SPARK-3491] [MLlib] [PySpark] use pickle to serialize data in MLlib | Davies Liu | 2014-09-19 | 1 | -5/+5
* [SPARK-3554] [PySpark] use broadcast automatically for large closure | Davies Liu | 2014-09-18 | 1 | -0/+4
* [SPARK-3519] add distinct(n) to PySpark | Matthew Farrellee | 2014-09-16 | 1 | -2/+2
* [SPARK-1087] Move python traceback utilities into new traceback_utils.py file. | Aaron Staple | 2014-09-15 | 1 | -55/+3
* [PySpark] Add blank line so that Python RDD.top() docstring renders correctly | RJ Nowling | 2014-09-12 | 1 | -0/+1
* SPARK-2978. Transformation with MR shuffle semantics | Sandy Ryza | 2014-09-08 | 1 | -0/+24
* [SPARK-2334] fix AttributeError when call PipelineRDD.id() | Davies Liu | 2014-09-06 | 1 | -0/+6
* Spark-3406 add a default storage level to python RDD persist API | Holden Karau | 2014-09-06 | 1 | -1/+6
* SPARK-3211 .take() is OOM-prone with empty partitions | Andrew Ash | 2014-09-05 | 1 | -4/+4
* [SPARK-3309] [PySpark] Put all public API in __all__ | Davies Liu | 2014-09-03 | 1 | -0/+1
* [SPARK-2871] [PySpark] add countApproxDistinct() API | Davies Liu | 2014-09-02 | 1 | -5/+34
* [SPARK-2871] [PySpark] add RDD.lookup(key) | Davies Liu | 2014-08-27 | 1 | -132/+79
* [SPARK-3073] [PySpark] use external sort in sortBy() and sortByKey() | Davies Liu | 2014-08-26 | 1 | -2/+7
* [SPARK-2871] [PySpark] add histgram() API | Davies Liu | 2014-08-26 | 1 | -1/+128
* [SPARK-2871] [PySpark] add zipWithIndex() and zipWithUniqueId() | Davies Liu | 2014-08-24 | 1 | -0/+47
* [SPARK-2871] [PySpark] add approx API for RDD | Davies Liu | 2014-08-23 | 1 | -0/+81
* [SPARK-2871] [PySpark] add `key` argument for max(), min() and top(n) | Davies Liu | 2014-08-23 | 1 | -17/+27
* [SPARK-3141] [PySpark] fix sortByKey() with take() | Davies Liu | 2014-08-19 | 1 | -10/+8
* [SPARK-2790] [PySpark] fix zip with serializers which have different batch si... | Davies Liu | 2014-08-19 | 1 | -0/+25
* [SPARK-3114] [PySpark] Fix Python UDFs in Spark SQL. | Josh Rosen | 2014-08-18 | 1 | -1/+1
* [SPARK-3103] [PySpark] fix saveAsTextFile() with utf-8 | Davies Liu | 2014-08-18 | 1 | -1/+3
* [SPARK-1065] [PySpark] improve supporting for large broadcast | Davies Liu | 2014-08-16 | 1 | -2/+3
* [SPARK-2983] [PySpark] improve performance of sortByKey() | Davies Liu | 2014-08-13 | 1 | -23/+24
* [PySpark] Add blanklines to Python docstrings so example code renders correctly | RJ Nowling | 2014-08-06 | 1 | -0/+9
* [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically | Nicholas Chammas | 2014-08-06 | 1 | -9/+13
* [SPARK-2010] [PySpark] [SQL] support nested structure in SchemaRDD | Davies Liu | 2014-08-01 | 1 | -4/+4
* [SPARK-2024] Add saveAsSequenceFile to PySpark | Kan Zhang | 2014-07-30 | 1 | -0/+114
* [SPARK-2601] [PySpark] Fix Py4J error when transforming pickleFiles | Josh Rosen | 2014-07-26 | 1 | -3/+1
* [SPARK-2656] Python version of stratified sampling | Doris Xin | 2014-07-24 | 1 | -2/+23
* [SPARK-2538] [PySpark] Hash based disk spilling aggregation | Davies Liu | 2014-07-24 | 1 | -21/+71
* [SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by d... | Prashant Sharma | 2014-07-24 | 1 | -2/+2
* [SPARK-2494] [PySpark] make hash of None consistant cross machines | Davies Liu | 2014-07-21 | 1 | -3/+32
* Made rdd.py pep8 complaint by using Autopep8 and a little manual editing. | Prashant Sharma | 2014-07-14 | 1 | -58/+92
* [SPARK-2061] Made splits deprecated in JavaRDDLike | Anant | 2014-06-20 | 1 | -2/+2
* SPARK-1868: Users should be allowed to cogroup at least 4 RDDs | Allan Douglas R. de Oliveira | 2014-06-20 | 1 | -7/+15
* SPARK-2203: PySpark defaults to use same num reduce partitions as map side | Aaron Davidson | 2014-06-20 | 1 | -3/+18
* SPARK-2146. Fix takeOrdered doc | Sandy Ryza | 2014-06-17 | 1 | -1/+1
* SPARK-1063 Add .sortBy(f) method on RDD | Andrew Ash | 2014-06-17 | 1 | -0/+12