spark
Mirror of Apache Spark
path: root/python/pyspark
Commit message | Author | Age | Files | Lines (-/+)
[SPARK-2179][SQL] Public API for DataTypes and Schema | Yin Huai | 2014-07-30 | 1 | -14/+553
[SPARK-2674] [SQL] [PySpark] support datetime type for SchemaRDD | Davies Liu | 2014-07-29 | 1 | -10/+12
[SPARK-791] [PySpark] fix pickle itemgetter with cloudpickle | Davies Liu | 2014-07-29 | 2 | -2/+9
[SPARK-2580] [PySpark] keep silent in worker if JVM close the socket | Davies Liu | 2014-07-29 | 2 | -8/+19
[SPARK-1550] [PySpark] Allow SparkContext creation after failed attempts | Josh Rosen | 2014-07-27 | 2 | -6/+18
[SPARK-2679] [MLLib] Ser/De for Double | Doris Xin | 2014-07-27 | 1 | -3/+45
[SPARK-2601] [PySpark] Fix Py4J error when transforming pickleFiles | Josh Rosen | 2014-07-26 | 2 | -3/+10
[SPARK-2652] [PySpark] Turning some default configs for PySpark | Davies Liu | 2014-07-26 | 1 | -1/+12
[SPARK-1458] [PySpark] Expose sc.version in Java and PySpark | Josh Rosen | 2014-07-26 | 1 | -0/+7
[SPARK-2656] Python version of stratified sampling | Doris Xin | 2014-07-24 | 2 | -5/+50
[SPARK-2538] [PySpark] Hash based disk spilling aggregation | Davies Liu | 2014-07-24 | 4 | -22/+595
[SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by d... | Prashant Sharma | 2014-07-24 | 3 | -3/+9
[SPARK-2470] PEP8 fixes to PySpark | Nicholas Chammas | 2014-07-21 | 18 | -97/+127
[SPARK-2494] [PySpark] make hash of None consistant cross machines | Davies Liu | 2014-07-21 | 1 | -3/+32
[SPARK-2552][MLLIB] stabilize logistic function in pyspark | Xiangrui Meng | 2014-07-20 | 1 | -1/+4
follow pep8 None should be compared using is or is not | Ken Takagiwa | 2014-07-15 | 4 | -7/+7
Made rdd.py pep8 complaint by using Autopep8 and a little manual editing. | Prashant Sharma | 2014-07-14 | 1 | -58/+92
[Minor] Remove unused val in Master | Andrew Or | 2014-07-11 | 1 | -1/+1
[SPARK-2376][SQL] Selecting list values inside nested JSON objects raises jav... | Yin Huai | 2014-07-07 | 1 | -10/+14
[SPARK-1394] Remove SIGCHLD handler in worker subprocess | Matthew Farrellee | 2014-06-28 | 1 | -0/+1
[SPARK-2242] HOTFIX: pyspark shell hangs on simple job | Andrew Or | 2014-06-25 | 1 | -8/+13
[SPARK-2061] Made splits deprecated in JavaRDDLike | Anant | 2014-06-20 | 2 | -3/+3
SPARK-1868: Users should be allowed to cogroup at least 4 RDDs | Allan Douglas R. de Oliveira | 2014-06-20 | 2 | -17/+25
SPARK-2203: PySpark defaults to use same num reduce partitions as map side | Aaron Davidson | 2014-06-20 | 1 | -3/+18
[SPARK-1466] Raise exception if pyspark Gateway process doesn't start. | Kay Ousterhout | 2014-06-18 | 1 | -4/+11
[SPARK-2060][SQL] Querying JSON Datasets with SQL and DSL in Spark SQL | Yin Huai | 2014-06-17 | 1 | -2/+62
SPARK-2146. Fix takeOrdered doc | Sandy Ryza | 2014-06-17 | 1 | -1/+1
SPARK-1063 Add .sortBy(f) method on RDD | Andrew Ash | 2014-06-17 | 1 | -0/+12
[SPARK-2130] End-user friendly String repr for StorageLevel in Python | Kan Zhang | 2014-06-16 | 2 | -0/+12
[SPARK-2010] Support for nested data in PySpark SQL | Kan Zhang | 2014-06-16 | 1 | -1/+21
[SPARK-2079] Support batching when serializing SchemaRDD to Python | Kan Zhang | 2014-06-14 | 1 | -1/+3
SPARK-1939 Refactor takeSample method in RDD to use ScaSRS | Doris Xin | 2014-06-12 | 1 | -61/+106
SPARK-554. Add aggregateByKey. | Sandy Ryza | 2014-06-12 | 2 | -1/+33
fixed typo in docstring for min() | Jeff Thompson | 2014-06-12 | 1 | -1/+1
HOTFIX: PySpark tests should be order insensitive. | Patrick Wendell | 2014-06-11 | 1 | -4/+4
[SPARK-2091][MLLIB] use numpy.dot instead of ndarray.dot | Xiangrui Meng | 2014-06-11 | 1 | -3/+5
SPARK-1416: PySpark support for SequenceFile and Hadoop InputFormats | Nick Pentreath | 2014-06-09 | 2 | -0/+282
[SPARK-1308] Add getNumPartitions to pyspark RDD | Syed Hashmi | 2014-06-09 | 1 | -18/+27
[SPARK-1752][MLLIB] Standardize text format for vectors and labeled points | Xiangrui Meng | 2014-06-04 | 4 | -51/+129
[SPARK-1161] Add saveAsPickleFile and SparkContext.pickleFile in Python | Kan Zhang | 2014-06-03 | 2 | -8/+39
[SPARK-1468] Modify the partition function used by partitionBy. | Erik Selin | 2014-06-03 | 1 | -1/+4
[SPARK-1942] Stop clearing spark.driver.port in unit tests | Syed Hashmi | 2014-06-03 | 1 | -4/+0
SPARK-1917: fix PySpark import of scipy.special functions | Uri Laserson | 2014-05-31 | 2 | -1/+25
SPARK-1839: PySpark RDD#take() shouldn't always read from driver | Aaron Davidson | 2014-05-31 | 2 | -21/+64
Added doctest and method description in context.py | Jyotiska NK | 2014-05-28 | 1 | -1/+14
Fix PEP8 violations in Python mllib. | Reynold Xin | 2014-05-25 | 8 | -88/+78
Python docstring update for sql.py. | Reynold Xin | 2014-05-25 | 1 | -61/+63
SPARK-1822: Some minor cleanup work on SchemaRDD.count() | Reynold Xin | 2014-05-25 | 1 | -1/+4
[SPARK-1822] SchemaRDD.count() should use query optimizer | Kan Zhang | 2014-05-25 | 1 | -1/+13
[SPARK-1900 / 1918] PySpark on YARN is broken | Andrew Or | 2014-05-24 | 1 | -2/+6