index : spark
Mirror of Apache Spark
path: root/python/pyspark/rdd.py
| Commit message | Author | Age | Files | Lines |
| --- | --- | --- | --- | --- |
| [SPARK-20232][PYTHON] Improve combineByKey docs | David Gingrich | 2017-04-13 | 1 | -5/+19 |
| [SPARK-19872] [PYTHON] Use the correct deserializer for RDD construction for ... | hyukjinkwon | 2017-03-15 | 1 | -1/+3 |
| [SPARK-13330][PYSPARK] PYTHONHASHSEED is not propgated to python worker | Jeff Zhang | 2017-02-24 | 1 | -1/+2 |
| [SPARK-18281] [SQL] [PYSPARK] Remove timeout for reading data through socket ... | Liang-Chi Hsieh | 2016-12-20 | 1 | -6/+5 |
| [SPARK-18447][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that` across P... | hyukjinkwon | 2016-11-22 | 1 | -26/+28 |
| [SPARK-18361][PYSPARK] Expose RDD localCheckpoint in PySpark | Gabriel Huang | 2016-11-21 | 1 | -1/+32 |
| [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that`/`... | hyukjinkwon | 2016-11-19 | 1 | -2/+2 |
| [SPARK-18365][DOCS] Improve Sample Method Documentation | anabranch | 2016-11-17 | 1 | -0/+5 |
| [SPARK-17817] [PYSPARK] [FOLLOWUP] PySpark RDD Repartitioning Results in High... | Liang-Chi Hsieh | 2016-10-18 | 1 | -6/+6 |
| [SPARK-17817][PYSPARK] PySpark RDD Repartitioning Results in Highly Skewed Pa... | Liang-Chi Hsieh | 2016-10-11 | 1 | -3/+10 |
| [SPARK-17679] [PYSPARK] remove unnecessary Py4J ListConverter patch | Jason White | 2016-10-03 | 1 | -11/+2 |
| [MINOR][PYSPARK][DOCS] Fix examples in PySpark documentation | hyukjinkwon | 2016-09-28 | 1 | -2/+2 |
| [DOC] improve python doc for rdd.histogram and dataframe.join | Mortada Mehyar | 2016-07-18 | 1 | -9/+9 |
| [MINOR] Fix Typos 'an -> a' | Zheng RuiFeng | 2016-06-06 | 1 | -2/+2 |
| [SPARK-15136][PYSPARK][DOC] Fix links to sphinx style and add a default param... | Holden Karau | 2016-05-09 | 1 | -3/+3 |
| [SPARK-14368][PYSPARK] Support python.spark.worker.memory with upper-case unit. | Yong Tang | 2016-04-05 | 1 | -1/+1 |
| [SPARK-14334] [SQL] add toLocalIterator for Dataset/DataFrame | Davies Liu | 2016-04-04 | 1 | -4/+4 |
| [SPARK-13467] [PYSPARK] abstract python function to simplify pyspark code | Wenchen Fan | 2016-02-24 | 1 | -9/+14 |
| [SPARK-13339][DOCS] Clarify commutative / associative operator requirements f... | Sean Owen | 2016-02-19 | 1 | -4/+3 |
| [SPARK-5865][API DOC] Add doc warnings for methods that return local data str... | Tommy YU | 2016-02-06 | 1 | -0/+17 |
| [SPARK-7683][PYSPARK] Confusing behavior of fold function of RDD in pyspark | Sean Owen | 2016-01-19 | 1 | -1/+1 |
| [SPARK-12091] [PYSPARK] Deprecate the JAVA-specific deserialized storage levels | gatorsmile | 2015-12-18 | 1 | -4/+4 |
| [SPARK-12090] [PYSPARK] consider shuffle in coalesce() | Davies Liu | 2015-12-01 | 1 | -1/+1 |
| [SPARK-11658] simplify documentation for PySpark combineByKey | Chris Snow | 2015-11-12 | 1 | -1/+0 |
| [SPARK-9821] [PYSPARK] pyspark-reduceByKey-should-take-a-custom-partitioner | Holden Karau | 2015-09-21 | 1 | -13/+16 |
| [SPARK-10710] Remove ability to disable spilling in core and SQL | Josh Rosen | 2015-09-19 | 1 | -18/+7 |
| [SPARK-10642] [PYSPARK] Fix crash when calling rdd.lookup() on tuple keys | Liang-Chi Hsieh | 2015-09-17 | 1 | -1/+4 |
| [SPARK-9828] [PYSPARK] Mutable values should not be default arguments | MechCoder | 2015-08-14 | 1 | -1/+4 |
| [SPARK-9144] Remove DAGScheduler.runLocallyWithinThread and spark.localExecut... | Josh Rosen | 2015-07-22 | 1 | -2/+2 |
| [SPARK-9021] [PYSPARK] Change RDD.aggregate() to do reduce(mapPartitions()) i... | Nicholas Hwang | 2015-07-19 | 1 | -2/+8 |
| [SPARK-7735] [PYSPARK] Raise Exception on non-zero exit from pipe commands | Scott Taylor | 2015-07-10 | 1 | -2/+14 |
| [SPARK-8738] [SQL] [PYSPARK] capture SQL AnalysisException in Python API | Davies Liu | 2015-06-30 | 1 | -1/+2 |
| [SPARK-7810] [PYSPARK] solve python rdd socket connection problem | Ai He | 2015-06-29 | 1 | -3/+15 |
| [SPARK-8541] [PYSPARK] test the absolute error in approx doctests | Scott Taylor | 2015-06-22 | 1 | -2/+2 |
| [SPARK-8373] [PYSPARK] Add emptyRDD to pyspark and fix the issue when calling... | zsxwing | 2015-06-17 | 1 | -1/+1 |
| [SPARK-6416] [DOCS] RDD.fold() requires the operator to be commutative | Sean Owen | 2015-05-21 | 1 | -2/+10 |
| [SPARK-6216] [PYSPARK] check python version of worker with driver | Davies Liu | 2015-05-18 | 1 | -2/+2 |
| [SPARK-7438] [SPARK CORE] Fixed validation of relativeSD in countApproxDistinct | Vinod K C | 2015-05-09 | 1 | -2/+0 |
| [SPARK-6949] [SQL] [PySpark] Support Date/Timestamp in Column expression | Davies Liu | 2015-04-21 | 1 | -0/+3 |
| [SPARK-4897] [PySpark] Python 3 support | Davies Liu | 2015-04-16 | 1 | -80/+109 |
| [SPARK-6886] [PySpark] fix big closure with shuffle | Davies Liu | 2015-04-15 | 1 | -10/+5 |
| [SPARK-6216] [PySpark] check the python version in worker | Davies Liu | 2015-04-10 | 1 | -1/+1 |
| [SPARK-5969][PySpark] Fix descending pyspark.rdd.sortByKey. | Milan Straka | 2015-04-10 | 1 | -1/+1 |
| [SPARK-3074] [PySpark] support groupByKey() with single huge key | Davies Liu | 2015-04-09 | 1 | -12/+36 |
| [SPARK-6667] [PySpark] remove setReuseAddress | Davies Liu | 2015-04-02 | 1 | -0/+1 |
| [SPARK-6370][core] Documentation: Improve all 3 docs for RDD.sample | mbonaci | 2015-03-20 | 1 | -0/+6 |
| [SPARK-6194] [SPARK-677] [PySpark] fix memory leak in collect() | Davies Liu | 2015-03-09 | 1 | -16/+14 |
| [SPARK-5944] [PySpark] fix version in Python API docs | Davies Liu | 2015-02-25 | 1 | -0/+4 |
| [SPARK-5973] [PySpark] fix zip with two RDDs with AutoBatchedSerializer | Davies Liu | 2015-02-24 | 1 | -1/+1 |
| [SPARK-5785] [PySpark] narrow dependency for cogroup/join in PySpark | Davies Liu | 2015-02-17 | 1 | -16/+33 |