author | cocoatomo <cocoatomo77@gmail.com> | 2014-10-02 11:13:19 -0700 |
---|---|---|
committer | Josh Rosen <joshrosen@apache.org> | 2014-10-02 11:13:19 -0700 |
commit | 5b4a5b1acdc439a58aa2a3561ac0e3fb09f529d6 (patch) | |
tree | 9cc7c9e6c186f7411a3e3a6a0dd515253097271a /docs | |
parent | 6e27cb630de69fa5acb510b4e2f6b980742b1957 (diff) | |
[SPARK-3706][PySpark] Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset
### Problem
The section "Using the shell" in the Spark Programming Guide (https://spark.apache.org/docs/latest/programming-guide.html#using-the-shell) says that we can run the pyspark REPL through IPython.
But the following command runs the default Python executable instead of IPython.
```
$ IPYTHON=1 ./bin/pyspark
Python 2.7.8 (default, Jul 2 2014, 10:14:46)
...
```
At commit b235e013638685758885842dc3268e9800af3678, the spark/bin/pyspark script decides which executable and options to use in the following way.
1. if PYSPARK_PYTHON is unset
* → default it to "python"
2. if IPYTHON_OPTS is set
* → set IPYTHON to "1"
3. if some Python scripts are passed to ./bin/pyspark → run them with ./bin/spark-submit
* out of this issue's scope
4. if IPYTHON is set to "1"
* → execute $PYSPARK_PYTHON (default: ipython) with arguments $IPYTHON_OPTS
* otherwise execute $PYSPARK_PYTHON
Therefore, when PYSPARK_PYTHON is unset, python is executed even though IPYTHON is "1".
In other words, when PYSPARK_PYTHON is unset, IPYTHON_OPTS and IPYTHON have no effect on which command is used.
PYSPARK_PYTHON | IPYTHON_OPTS | IPYTHON | resulting command | expected command
---- | ---- | ----- | ----- | -----
(unset → defaults to python) | (unset) | (unset) | python | (same)
(unset → defaults to python) | (unset) | 1 | python | ipython
(unset → defaults to python) | an_option | (unset → set to 1) | python an_option | ipython an_option
(unset → defaults to python) | an_option | 1 | python an_option | ipython an_option
ipython | (unset) | (unset) | ipython | (same)
ipython | (unset) | 1 | ipython | (same)
ipython | an_option | (unset → set to 1) | ipython an_option | (same)
ipython | an_option | 1 | ipython an_option | (same)
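The decision order described above can be reproduced in a small shell sketch. `old_pyspark_command` is a hypothetical helper written for illustration, not code taken from bin/pyspark; it assumes the REPL case only (no script arguments), takes the three variables as positional arguments, and prints the command that would be run.

```shell
#!/bin/sh
# Hypothetical helper (not part of bin/pyspark): prints the command chosen by
# the old decision order. Arguments: PYSPARK_PYTHON, IPYTHON_OPTS, IPYTHON.
# The body runs in a subshell, so the variables stay local to the call.
old_pyspark_command() (
  pyspark_python="$1"
  ipython_opts="$2"
  ipython="$3"

  # Step 1: PYSPARK_PYTHON is defaulted before IPython is even considered.
  if [ -z "$pyspark_python" ]; then
    pyspark_python="python"
  fi

  # Step 2: setting IPYTHON_OPTS implies IPYTHON=1.
  if [ -n "$ipython_opts" ]; then
    ipython=1
  fi

  # Step 4: the "ipython" fallback never applies, because step 1 has already
  # filled in PYSPARK_PYTHON.
  if [ "$ipython" = "1" ]; then
    echo "${pyspark_python:-ipython}${ipython_opts:+ $ipython_opts}"
  else
    echo "$pyspark_python"
  fi
)

old_pyspark_command "" "" "1"          # prints "python" (second table row)
old_pyspark_command "" "an_option" ""  # prints "python an_option" (third table row)
```

Both calls print a `python` command, which matches the "resulting command" column rather than the "expected command" column.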
### Suggestion
The pyspark script should first determine whether the user wants to run IPython or another executable.
1. if IPYTHON_OPTS is set
* set IPYTHON to "1"
2. if IPYTHON has the value "1"
* PYSPARK_PYTHON defaults to "ipython" if not set
3. PYSPARK_PYTHON defaults to "python" if not set
See the pull request for the detailed modifications.
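As a rough sketch of the suggested order only (the pull request itself goes further and replaces the environment variables), the three steps could look like the following. `new_pyspark_command` is a hypothetical helper for illustration and is not the code that was merged; it takes PYSPARK_PYTHON, IPYTHON_OPTS and IPYTHON as positional arguments and prints the resulting command.

```shell
#!/bin/sh
# Hypothetical helper (not the merged bin/pyspark code): prints the command
# chosen by the suggested decision order.
# Arguments: PYSPARK_PYTHON, IPYTHON_OPTS, IPYTHON.
# The body runs in a subshell, so the variables stay local to the call.
new_pyspark_command() (
  pyspark_python="$1"
  ipython_opts="$2"
  ipython="$3"

  # 1. Setting IPYTHON_OPTS implies IPYTHON=1.
  if [ -n "$ipython_opts" ]; then
    ipython=1
  fi

  # 2. When IPython is requested, an unset PYSPARK_PYTHON becomes "ipython".
  if [ "$ipython" = "1" ]; then
    pyspark_python="${pyspark_python:-ipython}"
  fi

  # 3. Otherwise an unset PYSPARK_PYTHON falls back to "python".
  pyspark_python="${pyspark_python:-python}"

  echo "${pyspark_python}${ipython_opts:+ $ipython_opts}"
)

new_pyspark_command "" "" "1"          # prints "ipython"
new_pyspark_command "" "an_option" ""  # prints "ipython an_option"
```

The only change from the old order is that the IPython check runs before the generic "python" default, so the default can no longer mask it.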
Author: cocoatomo <cocoatomo77@gmail.com>
Closes #2554 from cocoatomo/issues/cannot-run-ipython-without-options and squashes the following commits:
d2a9b06 [cocoatomo] [SPARK-3706][PySpark] Use PYTHONUNBUFFERED environment variable instead of -u option
264114c [cocoatomo] [SPARK-3706][PySpark] Remove the sentence about deprecated environment variables
42e02d5 [cocoatomo] [SPARK-3706][PySpark] Replace environment variables used to customize execution of PySpark REPL
10d56fb [cocoatomo] [SPARK-3706][PySpark] Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset
Diffstat (limited to 'docs')
-rw-r--r-- | docs/programming-guide.md | 8 |
1 files changed, 4 insertions, 4 deletions
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 1d61a3c555..8e8cc1dd98 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -211,17 +211,17 @@ For a complete list of options, run `pyspark --help`. Behind the scenes,
 
 It is also possible to launch the PySpark shell in [IPython](http://ipython.org), the
 enhanced Python interpreter. PySpark works with IPython 1.0.0 and later. To
-use IPython, set the `IPYTHON` variable to `1` when running `bin/pyspark`:
+use IPython, set the `PYSPARK_PYTHON` variable to `ipython` when running `bin/pyspark`:
 
 {% highlight bash %}
-$ IPYTHON=1 ./bin/pyspark
+$ PYSPARK_PYTHON=ipython ./bin/pyspark
 {% endhighlight %}
 
-You can customize the `ipython` command by setting `IPYTHON_OPTS`. For example, to launch
+You can customize the `ipython` command by setting `PYSPARK_PYTHON_OPTS`. For example, to launch
 the [IPython Notebook](http://ipython.org/notebook.html) with PyLab plot support:
 
 {% highlight bash %}
-$ IPYTHON_OPTS="notebook --pylab inline" ./bin/pyspark
+$ PYSPARK_PYTHON=ipython PYSPARK_PYTHON_OPTS="notebook --pylab inline" ./bin/pyspark
 {% endhighlight %}
 
 </div>