path: root/python
Commit message [Author, Age, Files, Lines]
* Fix Python saveAsTextFile doctest to not expect order to be preserved [Jey Kottalam, 2013-04-02, 1 file, -1/+1]
|
* Fix argv handling in Python transitive closure example [Jey Kottalam, 2013-04-02, 1 file, -1/+1]
|
* Change numSplits to numPartitions in PySpark. [Josh Rosen, 2013-02-24, 2 files, -38/+38]
|
* Add commutative requirement for 'reduce' to Python docstring. [Mark Hamstra, 2013-02-09, 1 file, -2/+2]
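
Why 'reduce' needs a commutative operator can be sketched without Spark at all. The example below assumes a simplified model in which each partition is reduced locally and the partial results are then merged in whatever order they happen to arrive; `distributed_reduce` and the sample data are hypothetical and are not PySpark code.

```python
from functools import reduce
from itertools import permutations


def distributed_reduce(partitions, op, order):
    """Reduce each partition locally, then merge the partials in `order`.

    `order` models the (arbitrary) order in which partial results arrive.
    """
    partials = [reduce(op, part) for part in partitions]
    return reduce(op, [partials[i] for i in order])


def add(x, y):
    return x + y


number_parts = [[1, 2], [3], [4, 5]]
string_parts = [["a", "b"], ["c"], ["d", "e"]]
orders = list(permutations(range(3)))

# Integer addition is commutative and associative: one answer for every order.
sum_results = {distributed_reduce(number_parts, add, o) for o in orders}

# String concatenation is associative but NOT commutative: the answer
# depends on the arrival order of the partial results.
concat_results = {distributed_reduce(string_parts, add, o) for o in orders}
```

With integer addition every arrival order gives the same value, while string concatenation yields a different string for each order.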
|
* Remove unnecessary doctest __main__ methods. [Josh Rosen, 2013-02-03, 2 files, -18/+0]
|
* Fetch fewer objects in PySpark's take() method. [Josh Rosen, 2013-02-03, 1 file, -0/+4]
|
* Fix reporting of PySpark doctest failures. [Josh Rosen, 2013-02-03, 2 files, -2/+6]
|
* Use spark.local.dir for PySpark temp files (SPARK-580). [Josh Rosen, 2013-02-01, 2 files, -10/+9]
|
* Do not launch JavaGateways on workers (SPARK-674). [Josh Rosen, 2013-02-01, 4 files, -18/+25]
|
|   The problem was that the gateway was being initialized whenever the
|   pyspark.context module was loaded. The fix uses lazy initialization that
|   occurs only when SparkContext instances are actually constructed. I also
|   made the gateway and jvm variables private.
|
|   This change results in ~3-4x performance improvement when running the
|   PySpark unit tests.
|
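
The lazy-initialization idea in the commit above can be illustrated with a minimal sketch. `ContextSketch`, `_ensure_initialized`, and the `launches` counter are hypothetical names invented for this example, and `object()` stands in for starting a real gateway process; this is not PySpark's actual code.

```python
class ContextSketch(object):
    """Hypothetical stand-in for a context class with a shared gateway."""

    _gateway = None  # private, shared by all instances; created on demand
    launches = 0     # counts how often the expensive launch step ran

    @classmethod
    def _ensure_initialized(cls):
        # Launch the gateway only once, and only when a context is built.
        if cls._gateway is None:
            cls.launches += 1
            cls._gateway = object()  # placeholder for launching a gateway

    def __init__(self):
        ContextSketch._ensure_initialized()


# Defining (or importing) the class launches nothing.
launches_before = ContextSketch.launches

# Constructing contexts triggers exactly one launch in total.
first = ContextSketch()
second = ContextSketch()
launches_after = ContextSketch.launches
```

A process that only imports the module, as a worker would, never pays the launch cost, which is the effect the commit message describes.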
* Fix stdout redirection in PySpark. [Josh Rosen, 2013-02-01, 2 files, -2/+12]
|
* SPARK-673: Capture and re-throw Python exceptions [Patrick Wendell, 2013-01-31, 1 file, -2/+8]
|
|   This patch alters the Python <-> executor protocol to pass on exception
|   data when they occur in user Python code.
|
* Merge pull request #430 from pwendell/pyspark-guide [Matei Zaharia, 2013-01-30, 1 file, -0/+1]
|\
| |   Minor improvements to PySpark docs
| |
| * Make module help available in python shell. [Patrick Wendell, 2013-01-30, 1 file, -0/+1]
| |
| |   Also, adds a line in doc explaining how to use.
| |
* | Replace old 'master' term with 'driver'. [Stephen Haberman, 2013-01-25, 1 file, -1/+1]
| |
* | Merge pull request #396 from JoshRosen/spark-653 [Matei Zaharia, 2013-01-24, 2 files, -14/+29]
|\ \
| | |   Make PySpark AccumulatorParam an abstract base class
| | |
| * | Remove use of abc.ABCMeta due to cloudpickle issue. [Josh Rosen, 2013-01-23, 1 file, -7/+4]
| | |
| | |   cloudpickle runs into issues while pickling subclasses of
| | |   AccumulatorParam, which may be related to this Python issue:
| | |   http://bugs.python.org/issue7689
| | |
| | |   This seems hard to fix and the ABCMeta wasn't necessary, so I removed it.
| | |
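
One way to keep an "abstract" base class without `abc.ABCMeta` is to have the base methods raise `NotImplementedError`. The sketch below assumes that approach; `AccumulatorParamSketch` and `ListSumParam` are hypothetical names, and the method names `zero` and `addInPlace` are assumed to mirror the interface the commits refer to. It uses the standard `pickle` module rather than cloudpickle.

```python
import pickle


class AccumulatorParamSketch(object):
    """Plain base class: abstract by convention, no metaclass involved."""

    def zero(self, value):
        # Subclasses return the identity element matching `value`'s shape.
        raise NotImplementedError

    def addInPlace(self, value1, value2):
        # Subclasses merge value2 into value1 and return the result.
        raise NotImplementedError


class ListSumParam(AccumulatorParamSketch):
    """Element-wise sum over lists of numbers."""

    def zero(self, value):
        return [0.0] * len(value)

    def addInPlace(self, value1, value2):
        for i in range(len(value1)):
            value1[i] += value2[i]
        return value1


# An ordinary subclass instance round-trips through pickle.
restored = pickle.loads(pickle.dumps(ListSumParam()))
total = restored.addInPlace(restored.zero([1.0, 2.0]), [1.0, 2.0])
```

The trade-off is that a subclass that forgets to override a method fails at call time rather than at instantiation time.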
| * | Make AccumulatorParam an abstract base class. [Josh Rosen, 2013-01-21, 2 files, -13/+31]
| | |
* | | Allow PySpark's SparkFiles to be used from driver [Josh Rosen, 2013-01-23, 5 files, -9/+63]
| | |
| | |   Fix minor documentation formatting issues.
| | |
* | | Fix sys.path bug in PySpark SparkContext.addPyFile [Josh Rosen, 2013-01-22, 4 files, -7/+41]
| | |
* | | Don't download files to master's working directory. [Josh Rosen, 2013-01-21, 5 files, -5/+70]
|/ /
| |   This should avoid exceptions caused by existing files with different
| |   contents.
| |
| |   I also removed some unused code.
| |
* | Merge pull request #389 from JoshRosen/python_rdd_checkpointing [Matei Zaharia, 2013-01-20, 5 files, -3/+116]
|\ \
| | |   Add checkpointing to the Python API
| | |
| * | Clean up setup code in PySpark checkpointing tests [Josh Rosen, 2013-01-20, 2 files, -16/+6]
| | |
| * | Update checkpointing API docs in Python/Java. [Josh Rosen, 2013-01-20, 2 files, -16/+12]
| | |
| * | Add checkpointFile() and more tests to PySpark. [Josh Rosen, 2013-01-20, 3 files, -2/+37]
| | |
| * | Add RDD checkpointing to Python API. [Josh Rosen, 2013-01-20, 5 files, -1/+93]
| | |
* | | Fix PythonPartitioner equality; see SPARK-654. [Josh Rosen, 2013-01-20, 1 file, -6/+11]
|/ /
| |   PythonPartitioner did not take the Python-side partitioning function
| |   into account when checking for equality, which might cause problems in
| |   the future.
| |
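
The equality problem described above can be shown with a small Python sketch. `PartitionerSketch` is a hypothetical class written for illustration; it is not the PythonPartitioner from the commit, and comparing Python function objects directly is an assumption made here for simplicity.

```python
class PartitionerSketch(object):
    """Hypothetical partitioner: partition count plus a partitioning function."""

    def __init__(self, num_partitions, partition_func):
        self.num_partitions = num_partitions
        self.partition_func = partition_func

    def get_partition(self, key):
        return self.partition_func(key) % self.num_partitions

    def __eq__(self, other):
        # Equal only if BOTH the partition count and the function match.
        return (isinstance(other, PartitionerSketch)
                and self.num_partitions == other.num_partitions
                and self.partition_func == other.partition_func)

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        return hash((self.num_partitions, self.partition_func))


by_hash = PartitionerSketch(4, hash)
by_hash_again = PartitionerSketch(4, hash)
by_length = PartitionerSketch(4, len)

same = by_hash == by_hash_again   # same count, same function
different = by_hash == by_length  # same count, different function
```

If equality compared only `num_partitions`, `by_hash` and `by_length` would look interchangeable even though they place keys differently, so data partitioned by one could wrongly be treated as already partitioned by the other.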
* / Add __repr__ to Accumulator; fix bug in sc.accumulator [Josh Rosen, 2013-01-20, 1 file, -1/+10]
|/
* Merge pull request #387 from mateiz/python-accumulators [Josh Rosen, 2013-01-20, 8 files, -5/+238]
|\
| |   Add accumulators to PySpark
| |
| * Add a class comment to Accumulator [Matei Zaharia, 2013-01-20, 1 file, -0/+12]
| |
| * Launch accumulator tests in run-tests [Matei Zaharia, 2013-01-20, 1 file, -0/+3]
| |
| * Added accumulators to PySpark [Matei Zaharia, 2013-01-20, 7 files, -5/+223]
| |
* | Minor formatting fixes [Matei Zaharia, 2013-01-20, 1 file, -1/+1]
| |
* | Python ALS example [Nick Pentreath, 2013-01-15, 1 file, -0/+71]
|/
* Change PYSPARK_PYTHON_EXEC to PYSPARK_PYTHON. [Josh Rosen, 2013-01-10, 1 file, -1/+1]
|
* Use take() instead of takeSample() in PySpark kmeans example. [Josh Rosen, 2013-01-09, 1 file, -1/+3]
|
|   This is a temporary change until we port takeSample().
|
* Indicate success/failure in PySpark test script. [Josh Rosen, 2013-01-09, 1 file, -0/+17]
|
* Add mapPartitionsWithSplit() to PySpark. [Josh Rosen, 2013-01-08, 2 files, -12/+25]
|
* Change PySpark RDD.take() to not call iterator(). [Josh Rosen, 2013-01-03, 2 files, -6/+6]
|
* Add `pyspark` script to replace the other scripts. [Josh Rosen, 2013-01-01, 2 files, -26/+19]
|
|   Expand the PySpark programming guide.
|
* Rename top-level 'pyspark' directory to 'python' [Josh Rosen, 2013-01-01, 21 files, -0/+2442]
|
* Fix Python 2.6 compatibility in Python API. [Josh Rosen, 2012-09-17, 1 file, -22/+0]
|
* Add Python API. [Josh Rosen, 2012-08-18, 1 file, -0/+22]