| author | Josh Rosen <joshrosen@apache.org> | 2014-08-01 19:38:21 -0700 |
|---|---|---|
| committer | Aaron Davidson <aaron@databricks.com> | 2014-08-01 19:38:21 -0700 |
| commit | e8e0fd691a06a2887fdcffb2217b96805ace0cb0 (patch) | |
| tree | e75662f9f8cfd5cb616b7f96482162811c8c9816 /examples | |
| parent | a38d3c9efcc0386b52ac4f041920985ae7300e28 (diff) | |
[SPARK-2764] Simplify daemon.py process structure
Currently, daemon.py forks a pool of numProcessors subprocesses, and those processes fork themselves again to create the actual Python worker processes that handle data.
I think that this extra layer of indirection is unnecessary and adds a lot of complexity. This commit attempts to remove this middle layer of subprocesses by launching the workers directly from daemon.py.
See https://github.com/mesos/spark/pull/563 for the original PR that added daemon.py, where I raise some issues with the current design.
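The change described above can be pictured with a minimal sketch: a daemon that forks each worker directly, with fork failures detected and surfaced rather than ignored. This is an illustration of the pattern, not Spark's actual daemon.py; `launch_worker` and `worker_main` are hypothetical names.

```python
import os

def launch_worker(worker_main):
    """Fork a worker directly from the daemon, with no middle layer.

    Returns the child's pid; a failed fork() is detected and reported
    rather than silently losing a worker.
    """
    try:
        pid = os.fork()
    except OSError as e:
        # fork() can fail (e.g. process limit reached); report it loudly.
        raise RuntimeError("worker launch failed: %s" % e)
    if pid == 0:
        # Child process: run the worker and never return into daemon code.
        try:
            worker_main()
        finally:
            os._exit(0)
    # Parent (the daemon): keep the pid so the worker can be reaped later.
    return pid
```

Launching directly from the daemon means there is exactly one parent to track worker pids and reap exit statuses, instead of an extra tier of intermediate processes doing the same bookkeeping.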
Author: Josh Rosen <joshrosen@apache.org>
Closes #1680 from JoshRosen/pyspark-daemon and squashes the following commits:
5abbcb9 [Josh Rosen] Replace magic number: 4 -> EINTR
5495dff [Josh Rosen] Throw IllegalStateException if worker launch fails.
b79254d [Josh Rosen] Detect failed fork() calls; improve error logging.
282c2c4 [Josh Rosen] Remove daemon.py exit logging, since it caused problems:
8554536 [Josh Rosen] Fix daemon’s shutdown(); log shutdown reason.
4e0fab8 [Josh Rosen] Remove shared-memory exit_flag; don't die on worker death.
e9892b4 [Josh Rosen] [WIP] [SPARK-2764] Simplify daemon.py process structure.
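The "magic number" in commit 5abbcb9 refers to errno 4, which is EINTR on Linux: a system call interrupted by a signal. A hedged sketch of the general retry pattern (illustrative only, not the daemon.py code; `read_interruptible` is a hypothetical name):

```python
import errno
import os

def read_interruptible(fd, n):
    """Read from fd, retrying when a signal interrupts the call.

    Naming errno.EINTR avoids hard-coding the magic number 4.
    (Python 3.5+ retries EINTR automatically per PEP 475, so this
    explicit loop mainly mattered on older interpreters.)
    """
    while True:
        try:
            return os.read(fd, n)
        except OSError as e:
            if e.errno != errno.EINTR:
                raise
            # Interrupted by a signal: retry the read.
```

A daemon spends most of its life blocked in reads and accepts, so treating EINTR as a retryable condition rather than a fatal error keeps it alive across routine signal delivery (e.g. SIGCHLD from exiting workers).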
Diffstat (limited to 'examples')
0 files changed, 0 insertions, 0 deletions