author Josh Rosen <joshrosen@apache.org> 2014-08-01 19:38:21 -0700
committer Aaron Davidson <aaron@databricks.com> 2014-08-01 19:38:21 -0700
commit e8e0fd691a06a2887fdcffb2217b96805ace0cb0
tree e75662f9f8cfd5cb616b7f96482162811c8c9816 /examples/src
parent a38d3c9efcc0386b52ac4f041920985ae7300e28
[SPARK-2764] Simplify daemon.py process structure
Currently, daemon.py forks a pool of numProcessors subprocesses, and those processes fork themselves again to create the actual Python worker processes that handle data.

I think that this extra layer of indirection is unnecessary and adds a lot of complexity. This commit attempts to remove this middle layer of subprocesses by launching the workers directly from daemon.py.

See https://github.com/mesos/spark/pull/563 for the original PR that added daemon.py, where I raise some issues with the current design.

Author: Josh Rosen <joshrosen@apache.org>

Closes #1680 from JoshRosen/pyspark-daemon and squashes the following commits:

5abbcb9 [Josh Rosen] Replace magic number: 4 -> EINTR
5495dff [Josh Rosen] Throw IllegalStateException if worker launch fails.
b79254d [Josh Rosen] Detect failed fork() calls; improve error logging.
282c2c4 [Josh Rosen] Remove daemon.py exit logging, since it caused problems:
8554536 [Josh Rosen] Fix daemon’s shutdown(); log shutdown reason.
4e0fab8 [Josh Rosen] Remove shared-memory exit_flag; don't die on worker death.
e9892b4 [Josh Rosen] [WIP] [SPARK-2764] Simplify daemon.py process structure.
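
To illustrate the single-level structure described in this commit message, here is a minimal sketch of a parent process that forks each worker directly, checks for a failed fork(), and retries an interrupted wait using errno.EINTR instead of a literal 4. This is not the code from daemon.py; the function names launch_worker and wait_for_worker are hypothetical, and the sketch assumes a POSIX system where os.fork() is available.

```python
import errno
import os


def launch_worker(work):
    """Fork one worker directly from the calling (daemon-like) process.

    Hypothetical helper for illustration only. Returns the child's pid
    in the parent, or None if fork() itself failed.
    """
    try:
        pid = os.fork()
    except OSError as e:
        # Report a failed fork() instead of assuming a worker exists.
        print("fork failed: %s" % e)
        return None
    if pid == 0:
        # Child: this process *is* the worker; there is no intermediate
        # pool process between the daemon and the worker.
        code = 0
        try:
            work()
        except BaseException:
            code = 1
        finally:
            # Exit without running the parent's cleanup handlers.
            os._exit(code)
    return pid


def wait_for_worker(pid):
    """Reap a worker, retrying if the wait is interrupted by a signal."""
    while True:
        try:
            _, status = os.waitpid(pid, 0)
        except OSError as e:
            if e.errno == errno.EINTR:  # named constant, not a magic number
                continue
            raise
        if os.WIFEXITED(status):
            return os.WEXITSTATUS(status)
        return -1  # the worker was terminated by a signal
```

In this sketch a worker that raises an exception exits with status 1, and the parent keeps running, which loosely mirrors the "don't die on worker death" item in the squashed commits. Recent Python 3 versions retry interrupted system calls automatically, so the explicit EINTR check mainly matters on older interpreters.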
Diffstat (limited to 'examples/src')
0 files changed, 0 insertions, 0 deletions