index : spark
Mirror of Apache Spark
path: root/python/pyspark/rdd.py
Commit message | Author | Age | Files | Lines
[SPARK-6416] [DOCS] RDD.fold() requires the operator to be commutative | Sean Owen | 2015-05-21 | 1 | -2/+10
[SPARK-6216] [PYSPARK] check python version of worker with driver | Davies Liu | 2015-05-18 | 1 | -2/+2
[SPARK-7438] [SPARK CORE] Fixed validation of relativeSD in countApproxDistinct | Vinod K C | 2015-05-09 | 1 | -2/+0
[SPARK-6949] [SQL] [PySpark] Support Date/Timestamp in Column expression | Davies Liu | 2015-04-21 | 1 | -0/+3
[SPARK-4897] [PySpark] Python 3 support | Davies Liu | 2015-04-16 | 1 | -80/+109
[SPARK-6886] [PySpark] fix big closure with shuffle | Davies Liu | 2015-04-15 | 1 | -10/+5
[SPARK-6216] [PySpark] check the python version in worker | Davies Liu | 2015-04-10 | 1 | -1/+1
[SPARK-5969][PySpark] Fix descending pyspark.rdd.sortByKey. | Milan Straka | 2015-04-10 | 1 | -1/+1
[SPARK-3074] [PySpark] support groupByKey() with single huge key | Davies Liu | 2015-04-09 | 1 | -12/+36
[SPARK-6667] [PySpark] remove setReuseAddress | Davies Liu | 2015-04-02 | 1 | -0/+1
[SPARK-6370][core] Documentation: Improve all 3 docs for RDD.sample | mbonaci | 2015-03-20 | 1 | -0/+6
[SPARK-6194] [SPARK-677] [PySpark] fix memory leak in collect() | Davies Liu | 2015-03-09 | 1 | -16/+14
[SPARK-5944] [PySpark] fix version in Python API docs | Davies Liu | 2015-02-25 | 1 | -0/+4
[SPARK-5973] [PySpark] fix zip with two RDDs with AutoBatchedSerializer | Davies Liu | 2015-02-24 | 1 | -1/+1
[SPARK-5785] [PySpark] narrow dependency for cogroup/join in PySpark | Davies Liu | 2015-02-17 | 1 | -16/+33
SPARK-5633 pyspark saveAsTextFile support for compression codec | Vladimir Vladimirov | 2015-02-06 | 1 | -2/+20
[SPARK-5577] Python udf for DataFrame | Davies Liu | 2015-02-04 | 1 | -16/+22
[SPARK-5430] move treeReduce and treeAggregate from mllib to core | Xiangrui Meng | 2015-01-28 | 1 | -1/+90
[SPARK-4387][PySpark] Refactoring python profiling code to make it extensible | Yandu Oppacher | 2015-01-28 | 1 | -6/+9
[SPARK-5440][pyspark] Add toLocalIterator to pyspark rdd | Michael Nazario | 2015-01-28 | 1 | -0/+14
SPARK-5458. Refer to aggregateByKey instead of combineByKey in docs | Sandy Ryza | 2015-01-28 | 1 | -2/+2
[SPARK-5063] More helpful error messages for several invalid operations | Josh Rosen | 2015-01-23 | 1 | -0/+11
SPARK-5270 [CORE] Provide isEmpty() function in RDD API | Sean Owen | 2015-01-19 | 1 | -0/+12
[SPARK-4822] Use sphinx tags for Python doc annotations | lewuathe | 2014-12-17 | 1 | -4/+4
[SPARK-4841] fix zip with textFile() | Davies Liu | 2014-12-15 | 1 | -14/+11
[SPARK-4477] [PySpark] remove numpy from RDDSampler | Davies Liu | 2014-11-20 | 1 | -4/+6
[SPARK-4327] [PySpark] Python API for RDD.randomSplit() | Davies Liu | 2014-11-18 | 1 | -3/+27
[SPARK-4304] [PySpark] Fix sort on empty RDD | Davies Liu | 2014-11-07 | 1 | -0/+2
[SPARK-3886] [PySpark] simplify serializer, use AutoBatchedSerializer by defa... | Davies Liu | 2014-11-03 | 1 | -54/+37
[SPARK-4148][PySpark] fix seed distribution and add some tests for rdd.sample | Xiangrui Meng | 2014-11-03 | 1 | -3/+0
[SPARK-4150][PySpark] return self in rdd.setName | Xiangrui Meng | 2014-10-31 | 1 | -2/+2
[Spark] RDD take() method: overestimate too much | yingjieMiao | 2014-10-13 | 1 | -1/+4
[SPARK-3909][PySpark][Doc] A corrupted format in Sphinx documents and buildin... | cocoatomo | 2014-10-11 | 1 | -1/+1
[SPARK-3412] [PySpark] Replace Epydoc with Sphinx to generate Python API docs | Davies Liu | 2014-10-07 | 1 | -26/+26
[SPARK-3773][PySpark][Doc] Sphinx build warning | cocoatomo | 2014-10-06 | 1 | -0/+1
[SPARK-3749] [PySpark] fix bugs in broadcast large closure of RDD | Davies Liu | 2014-10-01 | 1 | -3/+9
[SPARK-3478] [PySpark] Profile the Python tasks | Davies Liu | 2014-09-30 | 1 | -2/+8
Revert "[SPARK-3478] [PySpark] Profile the Python tasks" | Josh Rosen | 2014-09-26 | 1 | -8/+2
[SPARK-3478] [PySpark] Profile the Python tasks | Davies Liu | 2014-09-26 | 1 | -2/+8
[SPARK-546] Add full outer join to RDD and DStream. | Aaron Staple | 2014-09-24 | 1 | -2/+23
[SPARK-3491] [MLlib] [PySpark] use pickle to serialize data in MLlib | Davies Liu | 2014-09-19 | 1 | -5/+5
[SPARK-3554] [PySpark] use broadcast automatically for large closure | Davies Liu | 2014-09-18 | 1 | -0/+4
[SPARK-3519] add distinct(n) to PySpark | Matthew Farrellee | 2014-09-16 | 1 | -2/+2
[SPARK-1087] Move python traceback utilities into new traceback_utils.py file. | Aaron Staple | 2014-09-15 | 1 | -55/+3
[PySpark] Add blank line so that Python RDD.top() docstring renders correctly | RJ Nowling | 2014-09-12 | 1 | -0/+1
SPARK-2978. Transformation with MR shuffle semantics | Sandy Ryza | 2014-09-08 | 1 | -0/+24
[SPARK-2334] fix AttributeError when call PipelineRDD.id() | Davies Liu | 2014-09-06 | 1 | -0/+6
Spark-3406 add a default storage level to python RDD persist API | Holden Karau | 2014-09-06 | 1 | -1/+6
SPARK-3211 .take() is OOM-prone with empty partitions | Andrew Ash | 2014-09-05 | 1 | -4/+4
[SPARK-3309] [PySpark] Put all public API in __all__ | Davies Liu | 2014-09-03 | 1 | -0/+1