spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	Initial work to rename package to org.apache.spark	Matei Zaharia	2013-09-01	1	-344/+0
*	Implementing SPARK-878 for PySpark: adding zip and egg files to context and p...	Andre Schumacher	2013-08-16	1	-1/+8
*	Optimize Python take() to not compute entire first partition	Matei Zaharia	2013-07-29	1	-28/+36
*	Add Apache license headers and LICENSE and NOTICE files	Matei Zaharia	2013-07-16	1	-0/+17
*	Fixed PySpark perf regression by not using socket.makefile(), and improved	root	2013-07-01	1	-3/+7
*	Fix performance bug with new Python code not using buffered streams	root	2013-07-01	1	-16/+17
*	use parens when calling method with side-effects	Jey Kottalam	2013-06-21	1	-2/+2
*	Rename PythonWorker to PythonWorkerFactory	Jey Kottalam	2013-06-21	1	-1/+1
*	Prefork Python worker processes	Jey Kottalam	2013-06-21	1	-41/+25
*	Add Python timing instrumentation	Jey Kottalam	2013-06-21	1	-0/+12
*	Checkpoint commit - compiles and passes a lot of tests - not all though, look...	Mridul Muralidharan	2013-04-15	1	-0/+2
*	Fix overly large thread names in PySpark	Matei Zaharia	2013-02-26	1	-2/+2
*	Renamed "splits" to "partitions"	Matei Zaharia	2013-02-17	1	-5/+5
*	Fetch fewer objects in PySpark's take() method.	Josh Rosen	2013-02-03	1	-2/+9
*	Small fix from last commit	Patrick Wendell	2013-01-31	1	-1/+1
*	Some style cleanup	Patrick Wendell	2013-01-31	1	-7/+4
*	SPARK-673: Capture and re-throw Python exceptions	Patrick Wendell	2013-01-31	1	-14/+26
*	Don't download files to master's working directory.	Josh Rosen	2013-01-21	1	-0/+2
*	Merge pull request #389 from JoshRosen/python_rdd_checkpointing	Matei Zaharia	2013-01-20	1	-3/+0
|\
| *	Add RDD checkpointing to Python API.	Josh Rosen	2013-01-20	1	-3/+0
* |	Fix PythonPartitioner equality; see SPARK-654.	Josh Rosen	2013-01-20	1	-5/+0
|/
*	Merge branch 'master' into streaming	Matei Zaharia	2013-01-20	1	-16/+67
|\
| *	Added accumulators to PySpark	Matei Zaharia	2013-01-20	1	-16/+67
* |	Disabled checkpoint for PairwiseRDD (pySpark).	Tathagata Das	2013-01-16	1	-0/+1
* |	Merge branch 'master' into streaming	Tathagata Das	2013-01-15	1	-7/+6
|/
*	Add mapPartitionsWithSplit() to PySpark.	Josh Rosen	2013-01-08	1	-0/+5
*	Change PySpark RDD.take() to not call iterator().	Josh Rosen	2013-01-03	1	-0/+4
*	Rename top-level 'pyspark' directory to 'python'	Josh Rosen	2013-01-01	1	-1/+1
*	Minor documentation and style fixes for PySpark.	Josh Rosen	2013-01-01	1	-13/+30
*	Update PySpark for compatibility with TaskContext.	Josh Rosen	2012-12-29	1	-8/+5
*	Fix bug (introduced by batching) in PySpark take()	Josh Rosen	2012-12-28	1	-1/+1
*	Mark api.python classes as private; echo Java output to stderr.	Josh Rosen	2012-12-28	1	-29/+21
*	Use filesystem to collect RDDs in PySpark.	Josh Rosen	2012-12-24	1	-42/+24
*	Fix PySpark hash partitioning bug.	Josh Rosen	2012-10-28	1	-6/+4
*	Remove PYTHONPATH from SparkContext's executorEnvs.	Josh Rosen	2012-10-22	1	-8/+7
*	Update Python API for v0.6.0 compatibility.	Josh Rosen	2012-10-19	1	-7/+11
*	Add pipe(), saveAsTextFile(), sc.union() to Python API.	Josh Rosen	2012-08-27	1	-2/+6
*	Simplify Python worker; pipeline the map step of partitionBy().	Josh Rosen	2012-08-27	1	-27/+7
*	Add broadcast variables to Python API.	Josh Rosen	2012-08-27	1	-17/+26
*	Use numpy in Python k-means example.	Josh Rosen	2012-08-22	1	-1/+7
*	Use only cPickle for serialization in Python API.	Josh Rosen	2012-08-21	1	-44/+148
*	Add Python API.	Josh Rosen	2012-08-18	1	-0/+147