path: root/python/pyspark/rdd.py
Commit message | Author | Date | Files | Lines
...
* [SPARK-2334] fix AttributeError when call PipelineRDD.id() | Davies Liu | 2014-09-06 | 1 | -0/+6
* Spark-3406 add a default storage level to python RDD persist API | Holden Karau | 2014-09-06 | 1 | -1/+6
* SPARK-3211 .take() is OOM-prone with empty partitions | Andrew Ash | 2014-09-05 | 1 | -4/+4
* [SPARK-3309] [PySpark] Put all public API in __all__ | Davies Liu | 2014-09-03 | 1 | -0/+1
* [SPARK-2871] [PySpark] add countApproxDistinct() API | Davies Liu | 2014-09-02 | 1 | -5/+34
* [SPARK-2871] [PySpark] add RDD.lookup(key) | Davies Liu | 2014-08-27 | 1 | -132/+79
* [SPARK-3073] [PySpark] use external sort in sortBy() and sortByKey() | Davies Liu | 2014-08-26 | 1 | -2/+7
* [SPARK-2871] [PySpark] add histgram() API | Davies Liu | 2014-08-26 | 1 | -1/+128
* [SPARK-2871] [PySpark] add zipWithIndex() and zipWithUniqueId() | Davies Liu | 2014-08-24 | 1 | -0/+47
* [SPARK-2871] [PySpark] add approx API for RDD | Davies Liu | 2014-08-23 | 1 | -0/+81
* [SPARK-2871] [PySpark] add `key` argument for max(), min() and top(n) | Davies Liu | 2014-08-23 | 1 | -17/+27
* [SPARK-3141] [PySpark] fix sortByKey() with take() | Davies Liu | 2014-08-19 | 1 | -10/+8
* [SPARK-2790] [PySpark] fix zip with serializers which have different batch si... | Davies Liu | 2014-08-19 | 1 | -0/+25
* [SPARK-3114] [PySpark] Fix Python UDFs in Spark SQL. | Josh Rosen | 2014-08-18 | 1 | -1/+1
* [SPARK-3103] [PySpark] fix saveAsTextFile() with utf-8 | Davies Liu | 2014-08-18 | 1 | -1/+3
* [SPARK-1065] [PySpark] improve supporting for large broadcast | Davies Liu | 2014-08-16 | 1 | -2/+3
* [SPARK-2983] [PySpark] improve performance of sortByKey() | Davies Liu | 2014-08-13 | 1 | -23/+24
* [PySpark] Add blanklines to Python docstrings so example code renders correctly | RJ Nowling | 2014-08-06 | 1 | -0/+9
* [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically | Nicholas Chammas | 2014-08-06 | 1 | -9/+13
* [SPARK-2010] [PySpark] [SQL] support nested structure in SchemaRDD | Davies Liu | 2014-08-01 | 1 | -4/+4
* [SPARK-2024] Add saveAsSequenceFile to PySpark | Kan Zhang | 2014-07-30 | 1 | -0/+114
* [SPARK-2601] [PySpark] Fix Py4J error when transforming pickleFiles | Josh Rosen | 2014-07-26 | 1 | -3/+1
* [SPARK-2656] Python version of stratified sampling | Doris Xin | 2014-07-24 | 1 | -2/+23
* [SPARK-2538] [PySpark] Hash based disk spilling aggregation | Davies Liu | 2014-07-24 | 1 | -21/+71
* [SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by d... | Prashant Sharma | 2014-07-24 | 1 | -2/+2
* [SPARK-2494] [PySpark] make hash of None consistant cross machines | Davies Liu | 2014-07-21 | 1 | -3/+32
* Made rdd.py pep8 complaint by using Autopep8 and a little manual editing. | Prashant Sharma | 2014-07-14 | 1 | -58/+92
* [SPARK-2061] Made splits deprecated in JavaRDDLike | Anant | 2014-06-20 | 1 | -2/+2
* SPARK-1868: Users should be allowed to cogroup at least 4 RDDs | Allan Douglas R. de Oliveira | 2014-06-20 | 1 | -7/+15
* SPARK-2203: PySpark defaults to use same num reduce partitions as map side | Aaron Davidson | 2014-06-20 | 1 | -3/+18
* SPARK-2146. Fix takeOrdered doc | Sandy Ryza | 2014-06-17 | 1 | -1/+1
* SPARK-1063 Add .sortBy(f) method on RDD | Andrew Ash | 2014-06-17 | 1 | -0/+12
* [SPARK-2130] End-user friendly String repr for StorageLevel in Python | Kan Zhang | 2014-06-16 | 1 | -0/+3
* SPARK-1939 Refactor takeSample method in RDD to use ScaSRS | Doris Xin | 2014-06-12 | 1 | -61/+106
* SPARK-554. Add aggregateByKey. | Sandy Ryza | 2014-06-12 | 1 | -1/+18
* fixed typo in docstring for min() | Jeff Thompson | 2014-06-12 | 1 | -1/+1
* [SPARK-1308] Add getNumPartitions to pyspark RDD | Syed Hashmi | 2014-06-09 | 1 | -18/+27
* [SPARK-1161] Add saveAsPickleFile and SparkContext.pickleFile in Python | Kan Zhang | 2014-06-03 | 1 | -8/+25
* [SPARK-1468] Modify the partition function used by partitionBy. | Erik Selin | 2014-06-03 | 1 | -1/+4
* SPARK-1839: PySpark RDD#take() shouldn't always read from driver | Aaron Davidson | 2014-05-31 | 1 | -21/+38
* Documentation: Encourage use of reduceByKey instead of groupByKey. | Patrick Wendell | 2014-05-14 | 1 | -0/+4
* [SPARK-1690] Tolerating empty elements when saving Python RDD to text files | Kan Zhang | 2014-05-10 | 1 | -0/+8
* [SPARK-1674] fix interrupted system call error in pyspark's RDD.pipe | Xiangrui Meng | 2014-04-29 | 1 | -3/+3
* SPARK-1242 Add aggregate to python rdd | Holden Karau | 2014-04-24 | 1 | -2/+29
* SPARK-1438 RDD.sample() make seed param optional | Arun Ramakrishnan | 2014-04-24 | 1 | -7/+6
* Spark 1271: Co-Group and Group-By should pass Iterable[X] | Holden Karau | 2014-04-08 | 1 | -5/+5
* SPARK-1305: Support persisting RDD's directly to Tachyon | Haoyuan Li | 2014-04-04 | 1 | -1/+2
* Spark 1162 Implemented takeOrdered in pyspark. | Prashant Sharma | 2014-04-03 | 1 | -5/+102
* SPARK-1322, top in pyspark should sort result in descending order. | Prashant Sharma | 2014-03-26 | 1 | -3/+3
* Added doctest for map function in rdd.py | Jyotiska NK | 2014-03-19 | 1 | -0/+4