author | Marcelo Vanzin <vanzin@cloudera.com> | 2015-08-18 11:36:36 -0700 |
---|---|---|
committer | Marcelo Vanzin <vanzin@cloudera.com> | 2015-08-18 11:36:36 -0700 |
commit | c1840a862eb548bc4306e53ee7e9f26986b31832 (patch) | |
tree | 594e4fe1248f97e772dd842e68cef7e228871ff9 | |
parent | 354f4582b637fa25d3892ec2b12869db50ed83c9 (diff) | |
[SPARK-7736] [CORE] Fix a race introduced in PythonRunner.
The fix for SPARK-7736 introduced a race where a port value of "-1"
could be passed down to the pyspark process, causing it to fail to
connect back to the JVM. This change adds code to fix that race.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes #8258 from vanzin/SPARK-7736.
-rw-r--r-- | core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala | 8 |
1 files changed, 7 insertions, 1 deletions
diff --git a/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala b/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala
index 4277ac2ad1..23d01e9cbb 100644
--- a/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala
@@ -52,10 +52,16 @@ object PythonRunner {
         gatewayServer.start()
       }
     })
-    thread.setName("py4j-gateway")
+    thread.setName("py4j-gateway-init")
     thread.setDaemon(true)
     thread.start()
 
+    // Wait until the gateway server has started, so that we know which port is it bound to.
+    // `gatewayServer.start()` will start a new thread and run the server code there, after
+    // initializing the socket, so the thread started above will end as soon as the server is
+    // ready to serve connections.
+    thread.join()
+
     // Build up a PYTHONPATH that includes the Spark assembly JAR (where this class is), the
     // python directories in SPARK_HOME (if set), and any files in the pyFiles argument
     val pathElements = new ArrayBuffer[String]
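
The pattern in the hunk above (start the server on a helper thread, then `join()` that thread before reading the port) can be illustrated in isolation. The following is a minimal sketch and not the Spark or Py4J code: `StubGateway`, `listeningPort`, `startAndGetPort`, and the thread name `stub-gateway-init` are hypothetical names introduced here, and the stub reports `-1` until its socket is bound only to mimic the symptom described in the commit message.

```scala
import java.net.ServerSocket

// Hypothetical stand-in for a gateway server (not the Py4J class).
// Its port reads as -1 until start() has bound the socket.
final class StubGateway {
  @volatile private var socket: Option[ServerSocket] = None

  def listeningPort: Int = socket.map(_.getLocalPort).getOrElse(-1)

  def start(): Unit = {
    // Port 0 asks the OS to pick any free port.
    socket = Some(new ServerSocket(0))
    // A real server would now hand the socket to its own serving thread.
  }

  def shutdown(): Unit = socket.foreach(_.close())
}

object JoinBeforeReadingPort {
  // Starts the gateway on a helper thread and returns the bound port.
  def startAndGetPort(gateway: StubGateway): Int = {
    val initThread = new Thread(new Runnable {
      override def run(): Unit = gateway.start()
    })
    initThread.setName("stub-gateway-init")
    initThread.setDaemon(true)
    initThread.start()

    // Without this join, the read below could run before start() has bound
    // the socket and would then observe -1.
    initThread.join()
    gateway.listeningPort
  }

  def main(args: Array[String]): Unit = {
    val gateway = new StubGateway
    val port = startAndGetPort(gateway)
    println(s"gateway bound to port $port")
    gateway.shutdown()
  }
}
```

Because `Thread.join()` returns only after the helper thread has finished, everything `start()` did is visible to the caller by the time the port is read. The sketch relies on `start()` returning once the socket is bound, which is the same property the added source comment attributes to `gatewayServer.start()`.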