path: root/python/pyspark
Commit message | Author | Date | Files | Lines (-/+)
* [SPARK-2656] Python version of stratified sampling | Doris Xin | 2014-07-24 | 2 | -5/+50
* [SPARK-2538] [PySpark] Hash based disk spilling aggregation | Davies Liu | 2014-07-24 | 4 | -22/+595
* [SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by d... | Prashant Sharma | 2014-07-24 | 3 | -3/+9
* [SPARK-2470] PEP8 fixes to PySpark | Nicholas Chammas | 2014-07-21 | 18 | -97/+127
* [SPARK-2494] [PySpark] make hash of None consistant cross machines | Davies Liu | 2014-07-21 | 1 | -3/+32
* [SPARK-2552][MLLIB] stabilize logistic function in pyspark | Xiangrui Meng | 2014-07-20 | 1 | -1/+4
* follow pep8 None should be compared using is or is not | Ken Takagiwa | 2014-07-15 | 4 | -7/+7
* Made rdd.py pep8 complaint by using Autopep8 and a little manual editing. | Prashant Sharma | 2014-07-14 | 1 | -58/+92
* [Minor] Remove unused val in Master | Andrew Or | 2014-07-11 | 1 | -1/+1
* [SPARK-2376][SQL] Selecting list values inside nested JSON objects raises jav... | Yin Huai | 2014-07-07 | 1 | -10/+14
* [SPARK-1394] Remove SIGCHLD handler in worker subprocess | Matthew Farrellee | 2014-06-28 | 1 | -0/+1
* [SPARK-2242] HOTFIX: pyspark shell hangs on simple job | Andrew Or | 2014-06-25 | 1 | -8/+13
* [SPARK-2061] Made splits deprecated in JavaRDDLike | Anant | 2014-06-20 | 2 | -3/+3
* SPARK-1868: Users should be allowed to cogroup at least 4 RDDs | Allan Douglas R. de Oliveira | 2014-06-20 | 2 | -17/+25
* SPARK-2203: PySpark defaults to use same num reduce partitions as map side | Aaron Davidson | 2014-06-20 | 1 | -3/+18
* [SPARK-1466] Raise exception if pyspark Gateway process doesn't start. | Kay Ousterhout | 2014-06-18 | 1 | -4/+11
* [SPARK-2060][SQL] Querying JSON Datasets with SQL and DSL in Spark SQL | Yin Huai | 2014-06-17 | 1 | -2/+62
* SPARK-2146. Fix takeOrdered doc | Sandy Ryza | 2014-06-17 | 1 | -1/+1
* SPARK-1063 Add .sortBy(f) method on RDD | Andrew Ash | 2014-06-17 | 1 | -0/+12
* [SPARK-2130] End-user friendly String repr for StorageLevel in Python | Kan Zhang | 2014-06-16 | 2 | -0/+12
* [SPARK-2010] Support for nested data in PySpark SQL | Kan Zhang | 2014-06-16 | 1 | -1/+21
* [SPARK-2079] Support batching when serializing SchemaRDD to Python | Kan Zhang | 2014-06-14 | 1 | -1/+3
* SPARK-1939 Refactor takeSample method in RDD to use ScaSRS | Doris Xin | 2014-06-12 | 1 | -61/+106
* SPARK-554. Add aggregateByKey. | Sandy Ryza | 2014-06-12 | 2 | -1/+33
* fixed typo in docstring for min() | Jeff Thompson | 2014-06-12 | 1 | -1/+1
* HOTFIX: PySpark tests should be order insensitive. | Patrick Wendell | 2014-06-11 | 1 | -4/+4
* [SPARK-2091][MLLIB] use numpy.dot instead of ndarray.dot | Xiangrui Meng | 2014-06-11 | 1 | -3/+5
* SPARK-1416: PySpark support for SequenceFile and Hadoop InputFormats | Nick Pentreath | 2014-06-09 | 2 | -0/+282
* [SPARK-1308] Add getNumPartitions to pyspark RDD | Syed Hashmi | 2014-06-09 | 1 | -18/+27
* [SPARK-1752][MLLIB] Standardize text format for vectors and labeled points | Xiangrui Meng | 2014-06-04 | 4 | -51/+129
* [SPARK-1161] Add saveAsPickleFile and SparkContext.pickleFile in Python | Kan Zhang | 2014-06-03 | 2 | -8/+39
* [SPARK-1468] Modify the partition function used by partitionBy. | Erik Selin | 2014-06-03 | 1 | -1/+4
* [SPARK-1942] Stop clearing spark.driver.port in unit tests | Syed Hashmi | 2014-06-03 | 1 | -4/+0
* SPARK-1917: fix PySpark import of scipy.special functions | Uri Laserson | 2014-05-31 | 2 | -1/+25
* SPARK-1839: PySpark RDD#take() shouldn't always read from driver | Aaron Davidson | 2014-05-31 | 2 | -21/+64
* Added doctest and method description in context.py | Jyotiska NK | 2014-05-28 | 1 | -1/+14
* Fix PEP8 violations in Python mllib. | Reynold Xin | 2014-05-25 | 8 | -88/+78
* Python docstring update for sql.py. | Reynold Xin | 2014-05-25 | 1 | -61/+63
* SPARK-1822: Some minor cleanup work on SchemaRDD.count() | Reynold Xin | 2014-05-25 | 1 | -1/+4
* [SPARK-1822] SchemaRDD.count() should use query optimizer | Kan Zhang | 2014-05-25 | 1 | -1/+13
* [SPARK-1900 / 1918] PySpark on YARN is broken | Andrew Or | 2014-05-24 | 1 | -2/+6
* [SPARK-1519] Support minPartitions param of wholeTextFiles() in PySpark | Kan Zhang | 2014-05-21 | 1 | -2/+10
* [SPARK-1808] Route bin/pyspark through Spark submit | Andrew Or | 2014-05-16 | 2 | -5/+7
* Documentation: Encourage use of reduceByKey instead of groupByKey. | Patrick Wendell | 2014-05-14 | 1 | -0/+4
* [FIX] do not load defaults when testing SparkConf in pyspark | Xiangrui Meng | 2014-05-14 | 1 | -1/+1
* [SQL] Make it possible to create Java/Python SQLContexts from an existing Sca... | Michael Armbrust | 2014-05-13 | 1 | -2/+5
* [SPARK-1690] Tolerating empty elements when saving Python RDD to text files | Kan Zhang | 2014-05-10 | 1 | -0/+8
* Add Python includes to path before depickling broadcast values | Bouke van der Bijl | 2014-05-10 | 1 | -7/+7
* [SPARK-1743][MLLIB] add loadLibSVMFile and saveAsLibSVMFile to pyspark | Xiangrui Meng | 2014-05-07 | 2 | -2/+178
* SPARK-1579: Clean up PythonRDD and avoid swallowing IOExceptions | Aaron Davidson | 2014-05-07 | 2 | -2/+14