Graph | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | Remove unnecessary doctest __main__ methods. | Josh Rosen | 2013-02-03 | 2 | -18/+0 |
| | |||||
* | Fetch fewer objects in PySpark's take() method. | Josh Rosen | 2013-02-03 | 1 | -0/+4 |
| | |||||
* | Fix reporting of PySpark doctest failures. | Josh Rosen | 2013-02-03 | 2 | -2/+6 |
| | |||||
* | Use spark.local.dir for PySpark temp files (SPARK-580). | Josh Rosen | 2013-02-01 | 2 | -10/+9 |
| | |||||
* | Do not launch JavaGateways on workers (SPARK-674). | Josh Rosen | 2013-02-01 | 4 | -18/+25 |
| | The problem was that the gateway was being initialized whenever the pyspark.context module was loaded. The fix uses lazy initialization that occurs only when SparkContext instances are actually constructed. I also made the gateway and jvm variables private. This change results in a ~3-4x performance improvement when running the PySpark unit tests. | ||||
* | Fix stdout redirection in PySpark. | Josh Rosen | 2013-02-01 | 2 | -2/+12 |
| | |||||
* | SPARK-673: Capture and re-throw Python exceptions | Patrick Wendell | 2013-01-31 | 1 | -2/+8 |
| | This patch alters the Python <-> executor protocol to pass on exception data when exceptions occur in user Python code. | ||||
* | Merge pull request #430 from pwendell/pyspark-guide | Matei Zaharia | 2013-01-30 | 1 | -0/+1 |
|\ | Minor improvements to PySpark docs | ||||
| * | Make module help available in python shell. | Patrick Wendell | 2013-01-30 | 1 | -0/+1 |
| | | Also adds a line to the docs explaining how to use it. | ||||
* | | Replace old 'master' term with 'driver'. | Stephen Haberman | 2013-01-25 | 1 | -1/+1 |
| | | |||||
* | | Merge pull request #396 from JoshRosen/spark-653 | Matei Zaharia | 2013-01-24 | 2 | -14/+29 |
|\ \ | Make PySpark AccumulatorParam an abstract base class | ||||
| * | | Remove use of abc.ABCMeta due to cloudpickle issue. | Josh Rosen | 2013-01-23 | 1 | -7/+4 |
| | | | cloudpickle runs into issues while pickling subclasses of AccumulatorParam, which may be related to this Python issue: http://bugs.python.org/issue7689 This seems hard to fix and the ABCMeta wasn't necessary, so I removed it. | ||||
| * | | Make AccumulatorParam an abstract base class. | Josh Rosen | 2013-01-21 | 2 | -13/+31 |
| | | | |||||
* | | | Allow PySpark's SparkFiles to be used from driver | Josh Rosen | 2013-01-23 | 5 | -9/+63 |
| | | | Fix minor documentation formatting issues. | ||||
* | | | Fix sys.path bug in PySpark SparkContext.addPyFile | Josh Rosen | 2013-01-22 | 4 | -7/+41 |
| | | | |||||
* | | | Don't download files to master's working directory. | Josh Rosen | 2013-01-21 | 5 | -5/+70 |
|/ / | This should avoid exceptions caused by existing files with different contents. I also removed some unused code. | ||||
* | | Merge pull request #389 from JoshRosen/python_rdd_checkpointing | Matei Zaharia | 2013-01-20 | 5 | -3/+116 |
|\ \ | Add checkpointing to the Python API | ||||
| * | | Clean up setup code in PySpark checkpointing tests | Josh Rosen | 2013-01-20 | 2 | -16/+6 |
| | | | |||||
| * | | Update checkpointing API docs in Python/Java. | Josh Rosen | 2013-01-20 | 2 | -16/+12 |
| | | | |||||
| * | | Add checkpointFile() and more tests to PySpark. | Josh Rosen | 2013-01-20 | 3 | -2/+37 |
| | | | |||||
| * | | Add RDD checkpointing to Python API. | Josh Rosen | 2013-01-20 | 5 | -1/+93 |
| | | | |||||
* | | | Fix PythonPartitioner equality; see SPARK-654. | Josh Rosen | 2013-01-20 | 1 | -6/+11 |
|/ / | PythonPartitioner did not take the Python-side partitioning function into account when checking for equality, which might cause problems in the future. | ||||
* / | Add __repr__ to Accumulator; fix bug in sc.accumulator | Josh Rosen | 2013-01-20 | 1 | -1/+10 |
|/ | |||||
* | Merge pull request #387 from mateiz/python-accumulators | Josh Rosen | 2013-01-20 | 8 | -5/+238 |
|\ | Add accumulators to PySpark | ||||
| * | Add a class comment to Accumulator | Matei Zaharia | 2013-01-20 | 1 | -0/+12 |
| | | |||||
| * | Launch accumulator tests in run-tests | Matei Zaharia | 2013-01-20 | 1 | -0/+3 |
| | | |||||
| * | Added accumulators to PySpark | Matei Zaharia | 2013-01-20 | 7 | -5/+223 |
| | | |||||
* | | Minor formatting fixes | Matei Zaharia | 2013-01-20 | 1 | -1/+1 |
| | | |||||
* | | Python ALS example | Nick Pentreath | 2013-01-15 | 1 | -0/+71 |
|/ | |||||
* | Change PYSPARK_PYTHON_EXEC to PYSPARK_PYTHON. | Josh Rosen | 2013-01-10 | 1 | -1/+1 |
| | |||||
* | Use take() instead of takeSample() in PySpark kmeans example. | Josh Rosen | 2013-01-09 | 1 | -1/+3 |
| | This is a temporary change until we port takeSample(). | ||||
* | Indicate success/failure in PySpark test script. | Josh Rosen | 2013-01-09 | 1 | -0/+17 |
| | |||||
* | Add mapPartitionsWithSplit() to PySpark. | Josh Rosen | 2013-01-08 | 2 | -12/+25 |
| | |||||
* | Change PySpark RDD.take() to not call iterator(). | Josh Rosen | 2013-01-03 | 2 | -6/+6 |
| | |||||
* | Add `pyspark` script to replace the other scripts. | Josh Rosen | 2013-01-01 | 2 | -26/+19 |
| | Expand the PySpark programming guide. | ||||
* | Rename top-level 'pyspark' directory to 'python' | Josh Rosen | 2013-01-01 | 21 | -0/+2442 |
| | |||||
* | Fix Python 2.6 compatibility in Python API. | Josh Rosen | 2012-09-17 | 1 | -22/+0 |
| | |||||
* | Add Python API. | Josh Rosen | 2012-08-18 | 1 | -0/+22 |