index : spark
Mirror of Apache Spark
path: root/python/pyspark/rdd.py
| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| [SPARK-4304] [PySpark] Fix sort on empty RDD | Davies Liu | 2014-11-07 | 1 | -0/+2 |
| [SPARK-3886] [PySpark] simplify serializer, use AutoBatchedSerializer by defa... | Davies Liu | 2014-11-03 | 1 | -54/+37 |
| [SPARK-4148][PySpark] fix seed distribution and add some tests for rdd.sample | Xiangrui Meng | 2014-11-03 | 1 | -3/+0 |
| [SPARK-4150][PySpark] return self in rdd.setName | Xiangrui Meng | 2014-10-31 | 1 | -2/+2 |
| [Spark] RDD take() method: overestimate too much | yingjieMiao | 2014-10-13 | 1 | -1/+4 |
| [SPARK-3909][PySpark][Doc] A corrupted format in Sphinx documents and buildin... | cocoatomo | 2014-10-11 | 1 | -1/+1 |
| [SPARK-3412] [PySpark] Replace Epydoc with Sphinx to generate Python API docs | Davies Liu | 2014-10-07 | 1 | -26/+26 |
| [SPARK-3773][PySpark][Doc] Sphinx build warning | cocoatomo | 2014-10-06 | 1 | -0/+1 |
| [SPARK-3749] [PySpark] fix bugs in broadcast large closure of RDD | Davies Liu | 2014-10-01 | 1 | -3/+9 |
| [SPARK-3478] [PySpark] Profile the Python tasks | Davies Liu | 2014-09-30 | 1 | -2/+8 |
| Revert "[SPARK-3478] [PySpark] Profile the Python tasks" | Josh Rosen | 2014-09-26 | 1 | -8/+2 |
| [SPARK-3478] [PySpark] Profile the Python tasks | Davies Liu | 2014-09-26 | 1 | -2/+8 |
| [SPARK-546] Add full outer join to RDD and DStream. | Aaron Staple | 2014-09-24 | 1 | -2/+23 |
| [SPARK-3491] [MLlib] [PySpark] use pickle to serialize data in MLlib | Davies Liu | 2014-09-19 | 1 | -5/+5 |
| [SPARK-3554] [PySpark] use broadcast automatically for large closure | Davies Liu | 2014-09-18 | 1 | -0/+4 |
| [SPARK-3519] add distinct(n) to PySpark | Matthew Farrellee | 2014-09-16 | 1 | -2/+2 |
| [SPARK-1087] Move python traceback utilities into new traceback_utils.py file. | Aaron Staple | 2014-09-15 | 1 | -55/+3 |
| [PySpark] Add blank line so that Python RDD.top() docstring renders correctly | RJ Nowling | 2014-09-12 | 1 | -0/+1 |
| SPARK-2978. Transformation with MR shuffle semantics | Sandy Ryza | 2014-09-08 | 1 | -0/+24 |
| [SPARK-2334] fix AttributeError when call PipelineRDD.id() | Davies Liu | 2014-09-06 | 1 | -0/+6 |
| Spark-3406 add a default storage level to python RDD persist API | Holden Karau | 2014-09-06 | 1 | -1/+6 |
| SPARK-3211 .take() is OOM-prone with empty partitions | Andrew Ash | 2014-09-05 | 1 | -4/+4 |
| [SPARK-3309] [PySpark] Put all public API in __all__ | Davies Liu | 2014-09-03 | 1 | -0/+1 |
| [SPARK-2871] [PySpark] add countApproxDistinct() API | Davies Liu | 2014-09-02 | 1 | -5/+34 |
| [SPARK-2871] [PySpark] add RDD.lookup(key) | Davies Liu | 2014-08-27 | 1 | -132/+79 |
| [SPARK-3073] [PySpark] use external sort in sortBy() and sortByKey() | Davies Liu | 2014-08-26 | 1 | -2/+7 |
| [SPARK-2871] [PySpark] add histgram() API | Davies Liu | 2014-08-26 | 1 | -1/+128 |
| [SPARK-2871] [PySpark] add zipWithIndex() and zipWithUniqueId() | Davies Liu | 2014-08-24 | 1 | -0/+47 |
| [SPARK-2871] [PySpark] add approx API for RDD | Davies Liu | 2014-08-23 | 1 | -0/+81 |
| [SPARK-2871] [PySpark] add `key` argument for max(), min() and top(n) | Davies Liu | 2014-08-23 | 1 | -17/+27 |
| [SPARK-3141] [PySpark] fix sortByKey() with take() | Davies Liu | 2014-08-19 | 1 | -10/+8 |
| [SPARK-2790] [PySpark] fix zip with serializers which have different batch si... | Davies Liu | 2014-08-19 | 1 | -0/+25 |
| [SPARK-3114] [PySpark] Fix Python UDFs in Spark SQL. | Josh Rosen | 2014-08-18 | 1 | -1/+1 |
| [SPARK-3103] [PySpark] fix saveAsTextFile() with utf-8 | Davies Liu | 2014-08-18 | 1 | -1/+3 |
| [SPARK-1065] [PySpark] improve supporting for large broadcast | Davies Liu | 2014-08-16 | 1 | -2/+3 |
| [SPARK-2983] [PySpark] improve performance of sortByKey() | Davies Liu | 2014-08-13 | 1 | -23/+24 |
| [PySpark] Add blanklines to Python docstrings so example code renders correctly | RJ Nowling | 2014-08-06 | 1 | -0/+9 |
| [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically | Nicholas Chammas | 2014-08-06 | 1 | -9/+13 |
| [SPARK-2010] [PySpark] [SQL] support nested structure in SchemaRDD | Davies Liu | 2014-08-01 | 1 | -4/+4 |
| [SPARK-2024] Add saveAsSequenceFile to PySpark | Kan Zhang | 2014-07-30 | 1 | -0/+114 |
| [SPARK-2601] [PySpark] Fix Py4J error when transforming pickleFiles | Josh Rosen | 2014-07-26 | 1 | -3/+1 |
| [SPARK-2656] Python version of stratified sampling | Doris Xin | 2014-07-24 | 1 | -2/+23 |
| [SPARK-2538] [PySpark] Hash based disk spilling aggregation | Davies Liu | 2014-07-24 | 1 | -21/+71 |
| [SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by d... | Prashant Sharma | 2014-07-24 | 1 | -2/+2 |
| [SPARK-2494] [PySpark] make hash of None consistant cross machines | Davies Liu | 2014-07-21 | 1 | -3/+32 |
| Made rdd.py pep8 complaint by using Autopep8 and a little manual editing. | Prashant Sharma | 2014-07-14 | 1 | -58/+92 |
| [SPARK-2061] Made splits deprecated in JavaRDDLike | Anant | 2014-06-20 | 1 | -2/+2 |
| SPARK-1868: Users should be allowed to cogroup at least 4 RDDs | Allan Douglas R. de Oliveira | 2014-06-20 | 1 | -7/+15 |
| SPARK-2203: PySpark defaults to use same num reduce partitions as map side | Aaron Davidson | 2014-06-20 | 1 | -3/+18 |
| SPARK-2146. Fix takeOrdered doc | Sandy Ryza | 2014-06-17 | 1 | -1/+1 |