path: root/python/pyspark
Commit message | Author | Date | Files | Lines (-/+)
* [SPARK-2656] Python version of stratified sampling | Doris Xin | 2014-07-24 | 2 | -5/+50
* [SPARK-2538] [PySpark] Hash based disk spilling aggregation | Davies Liu | 2014-07-24 | 4 | -22/+595
* [SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by d... | Prashant Sharma | 2014-07-24 | 3 | -3/+9
* [SPARK-2470] PEP8 fixes to PySpark | Nicholas Chammas | 2014-07-21 | 18 | -97/+127
* [SPARK-2494] [PySpark] make hash of None consistant cross machines | Davies Liu | 2014-07-21 | 1 | -3/+32
* [SPARK-2552][MLLIB] stabilize logistic function in pyspark | Xiangrui Meng | 2014-07-20 | 1 | -1/+4
* follow pep8 None should be compared using is or is not | Ken Takagiwa | 2014-07-15 | 4 | -7/+7
* Made rdd.py pep8 complaint by using Autopep8 and a little manual editing. | Prashant Sharma | 2014-07-14 | 1 | -58/+92
* [Minor] Remove unused val in Master | Andrew Or | 2014-07-11 | 1 | -1/+1
* [SPARK-2376][SQL] Selecting list values inside nested JSON objects raises jav... | Yin Huai | 2014-07-07 | 1 | -10/+14
* [SPARK-1394] Remove SIGCHLD handler in worker subprocess | Matthew Farrellee | 2014-06-28 | 1 | -0/+1
* [SPARK-2242] HOTFIX: pyspark shell hangs on simple job | Andrew Or | 2014-06-25 | 1 | -8/+13
* [SPARK-2061] Made splits deprecated in JavaRDDLike | Anant | 2014-06-20 | 2 | -3/+3
* SPARK-1868: Users should be allowed to cogroup at least 4 RDDs | Allan Douglas R. de Oliveira | 2014-06-20 | 2 | -17/+25
* SPARK-2203: PySpark defaults to use same num reduce partitions as map side | Aaron Davidson | 2014-06-20 | 1 | -3/+18
* [SPARK-1466] Raise exception if pyspark Gateway process doesn't start. | Kay Ousterhout | 2014-06-18 | 1 | -4/+11
* [SPARK-2060][SQL] Querying JSON Datasets with SQL and DSL in Spark SQL | Yin Huai | 2014-06-17 | 1 | -2/+62
* SPARK-2146. Fix takeOrdered doc | Sandy Ryza | 2014-06-17 | 1 | -1/+1
* SPARK-1063 Add .sortBy(f) method on RDD | Andrew Ash | 2014-06-17 | 1 | -0/+12
* [SPARK-2130] End-user friendly String repr for StorageLevel in Python | Kan Zhang | 2014-06-16 | 2 | -0/+12
* [SPARK-2010] Support for nested data in PySpark SQL | Kan Zhang | 2014-06-16 | 1 | -1/+21
* [SPARK-2079] Support batching when serializing SchemaRDD to Python | Kan Zhang | 2014-06-14 | 1 | -1/+3
* SPARK-1939 Refactor takeSample method in RDD to use ScaSRS | Doris Xin | 2014-06-12 | 1 | -61/+106
* SPARK-554. Add aggregateByKey. | Sandy Ryza | 2014-06-12 | 2 | -1/+33
* fixed typo in docstring for min() | Jeff Thompson | 2014-06-12 | 1 | -1/+1
* HOTFIX: PySpark tests should be order insensitive. | Patrick Wendell | 2014-06-11 | 1 | -4/+4
* [SPARK-2091][MLLIB] use numpy.dot instead of ndarray.dot | Xiangrui Meng | 2014-06-11 | 1 | -3/+5
* SPARK-1416: PySpark support for SequenceFile and Hadoop InputFormats | Nick Pentreath | 2014-06-09 | 2 | -0/+282
* [SPARK-1308] Add getNumPartitions to pyspark RDD | Syed Hashmi | 2014-06-09 | 1 | -18/+27
* [SPARK-1752][MLLIB] Standardize text format for vectors and labeled points | Xiangrui Meng | 2014-06-04 | 4 | -51/+129
* [SPARK-1161] Add saveAsPickleFile and SparkContext.pickleFile in Python | Kan Zhang | 2014-06-03 | 2 | -8/+39
* [SPARK-1468] Modify the partition function used by partitionBy. | Erik Selin | 2014-06-03 | 1 | -1/+4
* [SPARK-1942] Stop clearing spark.driver.port in unit tests | Syed Hashmi | 2014-06-03 | 1 | -4/+0
* SPARK-1917: fix PySpark import of scipy.special functions | Uri Laserson | 2014-05-31 | 2 | -1/+25
* SPARK-1839: PySpark RDD#take() shouldn't always read from driver | Aaron Davidson | 2014-05-31 | 2 | -21/+64
* Added doctest and method description in context.py | Jyotiska NK | 2014-05-28 | 1 | -1/+14
* Fix PEP8 violations in Python mllib. | Reynold Xin | 2014-05-25 | 8 | -88/+78
* Python docstring update for sql.py. | Reynold Xin | 2014-05-25 | 1 | -61/+63
* SPARK-1822: Some minor cleanup work on SchemaRDD.count() | Reynold Xin | 2014-05-25 | 1 | -1/+4
* [SPARK-1822] SchemaRDD.count() should use query optimizer | Kan Zhang | 2014-05-25 | 1 | -1/+13
* [SPARK-1900 / 1918] PySpark on YARN is broken | Andrew Or | 2014-05-24 | 1 | -2/+6
* [SPARK-1519] Support minPartitions param of wholeTextFiles() in PySpark | Kan Zhang | 2014-05-21 | 1 | -2/+10
* [SPARK-1808] Route bin/pyspark through Spark submit | Andrew Or | 2014-05-16 | 2 | -5/+7
* Documentation: Encourage use of reduceByKey instead of groupByKey. | Patrick Wendell | 2014-05-14 | 1 | -0/+4
* [FIX] do not load defaults when testing SparkConf in pyspark | Xiangrui Meng | 2014-05-14 | 1 | -1/+1
* [SQL] Make it possible to create Java/Python SQLContexts from an existing Sca... | Michael Armbrust | 2014-05-13 | 1 | -2/+5
* [SPARK-1690] Tolerating empty elements when saving Python RDD to text files | Kan Zhang | 2014-05-10 | 1 | -0/+8
* Add Python includes to path before depickling broadcast values | Bouke van der Bijl | 2014-05-10 | 1 | -7/+7
* [SPARK-1743][MLLIB] add loadLibSVMFile and saveAsLibSVMFile to pyspark | Xiangrui Meng | 2014-05-07 | 2 | -2/+178
* SPARK-1579: Clean up PythonRDD and avoid swallowing IOExceptions | Aaron Davidson | 2014-05-07 | 2 | -2/+14