author | Josh Rosen <joshrosen@databricks.com> | 2015-02-16 15:25:11 -0800 |
---|---|---|
committer | Josh Rosen <joshrosen@databricks.com> | 2015-02-16 15:25:11 -0800 |
commit | 0cfda8461f173428f955aa9a7140b1356beea400 (patch) | |
tree | 809e2c44614f9df6e724f7c9cda2dc16cf69cf59 /docs/running-on-yarn.md | |
parent | c01c4ebcfe5c1a4a56a8987af596eca090c2cc2f (diff) | |
[SPARK-2313] Use socket to communicate GatewayServer port back to Python driver
This patch changes PySpark so that the GatewayServer's port is communicated back to the Python process that launches it over a local socket instead of a pipe. The old pipe-based approach was brittle and could fail if `spark-submit` printed unexpected output to stdout.
To accomplish this, I wrote a custom `PythonGatewayServer.main()` function to use in place of Py4J's `GatewayServer.main()`.
Closes #3424.
Author: Josh Rosen <joshrosen@databricks.com>
Closes #4603 from JoshRosen/SPARK-2313 and squashes the following commits:
6a7740b [Josh Rosen] Remove EchoOutputThread since it's no longer needed
0db501f [Josh Rosen] Use select() so that we don't block if GatewayServer dies.
9bdb4b6 [Josh Rosen] Handle case where getListeningPort returns -1
3fb7ed1 [Josh Rosen] Remove stdout=PIPE
2458934 [Josh Rosen] Use underscore to mark env var. as private
d12c95d [Josh Rosen] Use Logging and Utils.tryOrExit()
e5f9730 [Josh Rosen] Wrap everything in a giant try-block
2f70689 [Josh Rosen] Use stdin PIPE to share fate with driver
8bf956e [Josh Rosen] Initial cut at passing Py4J gateway port back to driver via socket
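The handoff described above can be sketched roughly as follows. This is an illustrative, self-contained approximation and not the code from this patch: the function names (`wait_for_gateway_port`, `fake_gateway`, `launch_and_get_port`), the 4-byte big-endian wire format, and the timeout are all assumptions, and a thread stands in for the JVM process that `spark-submit` would launch. The port value 25333 is an arbitrary example. The sketch shows the driver listening on an ephemeral local port, using `select()` so it does not block forever if the gateway dies, and treating a reported port of -1 as a failure.

```python
import select
import socket
import struct
import threading


def wait_for_gateway_port(listener, timeout_s=5.0):
    """Wait for a peer to connect and send a 4-byte big-endian int (assumed format)."""
    # select() returns an empty list on timeout, so a dead gateway
    # cannot make the driver hang in accept().
    readable, _, _ = select.select([listener], [], [], timeout_s)
    if not readable:
        raise RuntimeError("gateway did not report its port in time")
    conn, _ = listener.accept()
    with conn:
        conn.settimeout(timeout_s)
        data = b""
        while len(data) < 4:
            chunk = conn.recv(4 - len(data))
            if not chunk:
                raise RuntimeError("gateway closed the connection early")
            data += chunk
    (port,) = struct.unpack("!i", data)
    if port == -1:
        # Mirrors the "getListeningPort returns -1" case from the commit list.
        raise RuntimeError("gateway failed to bind a port")
    return port


def fake_gateway(callback_port, gateway_port):
    """Hypothetical stand-in for the JVM side: connect back and report the port."""
    with socket.create_connection(("127.0.0.1", callback_port)) as s:
        s.sendall(struct.pack("!i", gateway_port))


def launch_and_get_port(gateway_port=25333):
    """Open a callback socket, start the stand-in gateway, and read its port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    callback_port = listener.getsockname()[1]
    worker = threading.Thread(target=fake_gateway, args=(callback_port, gateway_port))
    worker.start()
    try:
        return wait_for_gateway_port(listener)
    finally:
        worker.join()
        listener.close()
```

In a real launcher the callback port would be handed to the child process, for example through a private environment variable, rather than passed to a thread. Because the port travels over its own socket, anything the child prints to stdout no longer interferes with the handoff.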