author | Bryan Cutler <cutlerb@gmail.com> | 2016-09-11 10:19:39 +0100
---|---|---
committer | Sean Owen <sowen@cloudera.com> | 2016-09-11 10:19:39 +0100
commit | c76baff0cc4775c2191d075cc9a8176e4915fec8 | (patch)
tree | 550e9f9d7b013365daddac0743d511968aba509d | /sbin/spark-config.sh
parent | bf22217377d7fe95b436d8b563c501aab2797f78 | (diff)
[SPARK-17336][PYSPARK] Fix appending multiple times to PYTHONPATH from spark-config.sh
## What changes were proposed in this pull request?
During startup of Spark standalone, the script file spark-config.sh appends to the PYTHONPATH and can be sourced many times, causing duplicate entries in the path. This change adds an environment flag that is set when the PYTHONPATH is appended, so the append happens only once.
## How was this patch tested?
Manually started standalone master/worker and verified PYTHONPATH has no duplicate entries.
Author: Bryan Cutler <cutlerb@gmail.com>
Closes #15028 from BryanCutler/fix-duplicate-pythonpath-SPARK-17336.
Diffstat (limited to 'sbin/spark-config.sh')
-rwxr-xr-x | sbin/spark-config.sh | 7 |
1 file changed, 5 insertions(+), 2 deletions(-)
```diff
diff --git a/sbin/spark-config.sh b/sbin/spark-config.sh
index a7a44cdde6..b7284487c5 100755
--- a/sbin/spark-config.sh
+++ b/sbin/spark-config.sh
@@ -26,5 +26,8 @@ fi
 export SPARK_CONF_DIR="${SPARK_CONF_DIR:-"${SPARK_HOME}/conf"}"
 
 # Add the PySpark classes to the PYTHONPATH:
-export PYTHONPATH="${SPARK_HOME}/python:${PYTHONPATH}"
-export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.3-src.zip:${PYTHONPATH}"
+if [ -z "${PYSPARK_PYTHONPATH_SET}" ]; then
+  export PYTHONPATH="${SPARK_HOME}/python:${PYTHONPATH}"
+  export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.3-src.zip:${PYTHONPATH}"
+  export PYSPARK_PYTHONPATH_SET=1
+fi
```
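The guard-flag pattern in this patch can be demonstrated in isolation. The sketch below is a hypothetical, self-contained snippet (the function name `append_pyspark_path` and the `/opt/spark` path are illustrative, not from the patch): sourcing or calling the append logic a second time is a no-op because the flag is already set, so PYTHONPATH gains the entry exactly once.

```shell
#!/bin/sh
# Illustrative reimplementation of the guard in spark-config.sh:
# append to PYTHONPATH only if our sentinel flag is unset.
append_pyspark_path() {
  if [ -z "${PYSPARK_PYTHONPATH_SET}" ]; then
    export PYTHONPATH="${SPARK_HOME}/python:${PYTHONPATH}"
    export PYSPARK_PYTHONPATH_SET=1
  fi
}

SPARK_HOME=/opt/spark   # hypothetical install location for the demo
PYTHONPATH=""

append_pyspark_path     # first call: appends the entry
append_pyspark_path     # second call: skipped, flag already set

echo "$PYTHONPATH"
```

Running this prints a single `/opt/spark/python` entry no matter how many times the function is invoked, which is exactly the behavior the patch restores for repeatedly sourced startup scripts.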