From de3223872a217c5224ba7136604f6b7753b29108 Mon Sep 17 00:00:00 2001
From: Davies Liu
Date: Tue, 18 Aug 2015 22:11:27 -0700
Subject: [SPARK-9705] [DOC] fix docs about Python version

cc JoshRosen

Author: Davies Liu

Closes #8245 from davies/python_doc.
---
 docs/configuration.md     |  6 +++++-
 docs/programming-guide.md | 12 ++++++++++--
 2 files changed, 15 insertions(+), 3 deletions(-)

(limited to 'docs')

diff --git a/docs/configuration.md b/docs/configuration.md
index 32147098ae..4a6e4dd05b 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1561,7 +1561,11 @@ The following variables can be set in `spark-env.sh`:
 </tr>
 <tr>
   <td><code>PYSPARK_PYTHON</code></td>
-  <td>Python binary executable to use for PySpark.</td>
+  <td>Python binary executable to use for PySpark in both driver and workers (default is <code>python</code>).</td>
+</tr>
+<tr>
+  <td><code>PYSPARK_DRIVER_PYTHON</code></td>
+  <td>Python binary executable to use for PySpark in driver only (default is <code>PYSPARK_PYTHON</code>).</td>
 </tr>
 <tr>
   <td><code>SPARK_LOCAL_IP</code></td>
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index ae712d6274..982c5eabe6 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -85,8 +85,8 @@ import org.apache.spark.SparkConf
-Spark {{site.SPARK_VERSION}} works with Python 2.6 or higher (but not Python 3). It uses the standard CPython interpreter,
-so C libraries like NumPy can be used.
+Spark {{site.SPARK_VERSION}} works with Python 2.6+ or Python 3.4+. It can use the standard CPython interpreter,
+so C libraries like NumPy can be used. It also works with PyPy 2.3+.
 
 To run Spark applications in Python, use the `bin/spark-submit` script located in the Spark directory.
 This script will load Spark's Java/Scala libraries and allow you to submit applications to a cluster.
@@ -104,6 +104,14 @@ Finally, you need to import some Spark classes into your program. Add the follow
 from pyspark import SparkContext, SparkConf
 {% endhighlight %}
 
+PySpark requires the same minor version of Python in both driver and workers. It uses the default python version in PATH,
+you can specify which version of Python you want to use by `PYSPARK_PYTHON`, for example:
+
+{% highlight bash %}
+$ PYSPARK_PYTHON=python3.4 bin/pyspark
+$ PYSPARK_PYTHON=/opt/pypy-2.5/bin/pypy bin/spark-submit examples/src/main/python/pi.py
+{% endhighlight %}
+
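The selection rule this patch documents (the driver uses `PYSPARK_DRIVER_PYTHON` if set, otherwise `PYSPARK_PYTHON`, otherwise `python`) can be sketched in plain shell; the interpreter values below are hypothetical examples, not part of the patch:

```shell
# Emulate PySpark's interpreter selection: workers use PYSPARK_PYTHON
# (default "python"); the driver uses PYSPARK_DRIVER_PYTHON, falling
# back to PYSPARK_PYTHON. The values here are hypothetical.
PYSPARK_PYTHON=python3.4
unset PYSPARK_DRIVER_PYTHON

worker_python="${PYSPARK_PYTHON:-python}"
driver_python="${PYSPARK_DRIVER_PYTHON:-$worker_python}"

echo "worker: $worker_python"   # prints "worker: python3.4"
echo "driver: $driver_python"   # prints "driver: python3.4"
```

Setting `PYSPARK_DRIVER_PYTHON` alone would change only `driver_python`, matching the "driver only" wording in the configuration.md row above.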