path: root/core/src/main/scala/spark/api/python/PythonRDD.scala
Commit message (Author, Age, Files, Lines)
* Initial work to rename package to org.apache.spark (Matei Zaharia, 2013-09-01, 1, -344/+0)
* Implementing SPARK-878 for PySpark: adding zip and egg files to context and p... (Andre Schumacher, 2013-08-16, 1, -1/+8)
* Optimize Python take() to not compute entire first partition (Matei Zaharia, 2013-07-29, 1, -28/+36)
* Add Apache license headers and LICENSE and NOTICE files (Matei Zaharia, 2013-07-16, 1, -0/+17)
* Fixed PySpark perf regression by not using socket.makefile(), and improved (root, 2013-07-01, 1, -3/+7)
* Fix performance bug with new Python code not using buffered streams (root, 2013-07-01, 1, -16/+17)
* use parens when calling method with side-effects (Jey Kottalam, 2013-06-21, 1, -2/+2)
* Rename PythonWorker to PythonWorkerFactory (Jey Kottalam, 2013-06-21, 1, -1/+1)
* Prefork Python worker processes (Jey Kottalam, 2013-06-21, 1, -41/+25)
* Add Python timing instrumentation (Jey Kottalam, 2013-06-21, 1, -0/+12)
* Checkpoint commit - compiles and passes a lot of tests - not all though, look... (Mridul Muralidharan, 2013-04-15, 1, -0/+2)
* Fix overly large thread names in PySpark (Matei Zaharia, 2013-02-26, 1, -2/+2)
* Renamed "splits" to "partitions" (Matei Zaharia, 2013-02-17, 1, -5/+5)
* Fetch fewer objects in PySpark's take() method. (Josh Rosen, 2013-02-03, 1, -2/+9)
* Small fix from last commit (Patrick Wendell, 2013-01-31, 1, -1/+1)
* Some style cleanup (Patrick Wendell, 2013-01-31, 1, -7/+4)
* SPARK-673: Capture and re-throw Python exceptions (Patrick Wendell, 2013-01-31, 1, -14/+26)
* Don't download files to master's working directory. (Josh Rosen, 2013-01-21, 1, -0/+2)
* Merge pull request #389 from JoshRosen/python_rdd_checkpointing (Matei Zaharia, 2013-01-20, 1, -3/+0)
|\
| * Add RDD checkpointing to Python API. (Josh Rosen, 2013-01-20, 1, -3/+0)
* | Fix PythonPartitioner equality; see SPARK-654. (Josh Rosen, 2013-01-20, 1, -5/+0)
|/
* Merge branch 'master' into streaming (Matei Zaharia, 2013-01-20, 1, -16/+67)
|\
| * Added accumulators to PySpark (Matei Zaharia, 2013-01-20, 1, -16/+67)
* | Disabled checkpoint for PairwiseRDD (pySpark). (Tathagata Das, 2013-01-16, 1, -0/+1)
* | Merge branch 'master' into streaming (Tathagata Das, 2013-01-15, 1, -7/+6)
|/
* Add mapPartitionsWithSplit() to PySpark. (Josh Rosen, 2013-01-08, 1, -0/+5)
* Change PySpark RDD.take() to not call iterator(). (Josh Rosen, 2013-01-03, 1, -0/+4)
* Rename top-level 'pyspark' directory to 'python' (Josh Rosen, 2013-01-01, 1, -1/+1)
* Minor documentation and style fixes for PySpark. (Josh Rosen, 2013-01-01, 1, -13/+30)
* Update PySpark for compatibility with TaskContext. (Josh Rosen, 2012-12-29, 1, -8/+5)
* Fix bug (introduced by batching) in PySpark take() (Josh Rosen, 2012-12-28, 1, -1/+1)
* Mark api.python classes as private; echo Java output to stderr. (Josh Rosen, 2012-12-28, 1, -29/+21)
* Use filesystem to collect RDDs in PySpark. (Josh Rosen, 2012-12-24, 1, -42/+24)
* Fix PySpark hash partitioning bug. (Josh Rosen, 2012-10-28, 1, -6/+4)
* Remove PYTHONPATH from SparkContext's executorEnvs. (Josh Rosen, 2012-10-22, 1, -8/+7)
* Update Python API for v0.6.0 compatibility. (Josh Rosen, 2012-10-19, 1, -7/+11)
* Add pipe(), saveAsTextFile(), sc.union() to Python API. (Josh Rosen, 2012-08-27, 1, -2/+6)
* Simplify Python worker; pipeline the map step of partitionBy(). (Josh Rosen, 2012-08-27, 1, -27/+7)
* Add broadcast variables to Python API. (Josh Rosen, 2012-08-27, 1, -17/+26)
* Use numpy in Python k-means example. (Josh Rosen, 2012-08-22, 1, -1/+7)
* Use only cPickle for serialization in Python API. (Josh Rosen, 2012-08-21, 1, -44/+148)
* Add Python API. (Josh Rosen, 2012-08-18, 1, -0/+147)