author Andrew Or <andrewor14@gmail.com> 2014-05-07 14:35:22 -0700
committer Aaron Davidson <aaron@databricks.com> 2014-05-07 14:35:37 -0700
commit 82c8e89c9581c45c7878b8f406cf3d90d4b0d74c
tree 09f45d6e9b347420e64c7ef2ef385851c68a54e8 /yarn
parent 0759ee790527f61bf9f4bcef4aa0befa1d430370
[SPARK-1688] Propagate PySpark worker stderr to driver

When at least one of the following conditions is true, PySpark cannot be loaded:

1. PYTHONPATH is not set
2. PYTHONPATH does not contain the python directory (or jar, in the case of YARN)
3. The jar does not contain pyspark files (YARN)
4. The jar does not contain py4j files (YARN)

However, we currently throw the same random `java.io.EOFException` for all of the above cases, when trying to read from the python daemon's output. This message is super unhelpful.

This PR includes the python stderr and the PYTHONPATH in the exception propagated to the driver. Now, the exception message looks something like:

```
Error from python worker:
  : No module named pyspark
PYTHONPATH was:
  /path/to/spark/python:/path/to/some/jar
java.io.EOFException
  <stack trace>
```

whereas before it was just

```
java.io.EOFException
  <stack trace>
```

Author: Andrew Or <andrewor14@gmail.com>

Closes #603 from andrewor14/pyspark-exception and squashes the following commits:

10d65d3 [Andrew Or] Throwable -> Exception, worker -> daemon
862d1d7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into pyspark-exception
a5ed798 [Andrew Or] Use block string and interpolation instead of var (minor)
cc09c45 [Andrew Or] Account for the fact that the python daemon may not have terminated yet
444f019 [Andrew Or] Use the new RedirectThread + include system PYTHONPATH
aab00ae [Andrew Or] Merge branch 'master' of github.com:apache/spark into pyspark-exception
0cc2402 [Andrew Or] Merge branch 'master' of github.com:apache/spark into pyspark-exception
783efe2 [Andrew Or] Make python daemon stderr indentation consistent
9524172 [Andrew Or] Avoid potential NPE / error stream contention + Move things around
29f9688 [Andrew Or] Add back original exception type
e92d36b [Andrew Or] Include python worker stderr in the exception propagated to the driver
7c69360 [Andrew Or] Merge branch 'master' of github.com:apache/spark into pyspark-exception
cdbc185 [Andrew Or] Fix python attribute not found exception when PYTHONPATH is not set
dcc0353 [Andrew Or] Check both python and system environment variables for PYTHONPATH
6c09c21 [Andrew Or] Validate PYTHONPATH and PySpark modules before starting python workers

(cherry picked from commit 5200872243aa5906dc8a06772e61d75f19557aac)
Signed-off-by: Aaron Davidson <aaron@databricks.com>
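
For readers who want the idea in miniature, here is a rough Python sketch of the technique the message describes: when a child Python process produces no output, attach its stderr and the PYTHONPATH it was given to the error instead of raising a bare EOF. This is not the code from this commit (the actual change lives on the JVM side of Spark). The names `format_worker_error` and `read_worker_port` are hypothetical, and the sketch simplifies by waiting for the child to exit, which a long-running daemon would not do.

```python
import os
import subprocess
import sys


def format_worker_error(stderr_text, pythonpath):
    """Build a message carrying the child's stderr and the PYTHONPATH it saw."""
    # Indent each stderr line so it stands apart from the headers.
    indented = "\n".join("  " + line for line in stderr_text.splitlines())
    return (
        "Error from python worker:\n"
        f"{indented}\n"
        "PYTHONPATH was:\n"
        f"  {pythonpath}"
    )


def read_worker_port(module, pythonpath):
    """Run `python -m module` and return the first line it writes to stdout.

    If the child writes nothing, raise EOFError with the child's stderr and
    PYTHONPATH in the message rather than an uninformative end-of-file error.
    """
    env = dict(os.environ, PYTHONPATH=pythonpath)
    proc = subprocess.Popen(
        [sys.executable, "-m", module],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        env=env,
        text=True,
    )
    # Simplification: communicate() blocks until the child exits.
    out, err = proc.communicate()
    lines = out.splitlines()
    if not lines:
        raise EOFError(format_worker_error(err, pythonpath))
    return lines[0]
```

Calling `read_worker_port` with a module that is not importable under the given PYTHONPATH should raise an `EOFError` whose message includes the interpreter's own "No module named ..." line, which is the diagnostic the original `java.io.EOFException` was missing.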
Diffstat (limited to 'yarn')
0 files changed, 0 insertions, 0 deletions