[SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if there is an exception while creating an Executor - spark


author	Tejas Patil <tejasp@fb.com>	2016-07-15 14:27:16 -0700
committer	Shixiong Zhu <shixiong@databricks.com>	2016-07-15 14:27:16 -0700
commit	b2f24f94591082d3ff82bd3db1760b14603b38aa (patch)
tree	8cc3f4f5fffd814aba5dc07288ae353c50a8503f /sql/catalyst/src
parent	611a8ca5895357059f1e7c035d946e0718b26a5a (diff)

[SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if there is an exception while creating an Executor

## What changes were proposed in this pull request?

With the fix from SPARK-13112, I see that `LaunchTask` is always processed after `RegisteredExecutor` is done, so the backend gets a chance to do all of its retries to start up an executor. A problem remains: if `Executor` creation itself fails with an exception, the failure goes unnoticed, and the executor is killed only later, when it tries to process `LaunchTask` and finds that `executor` is null: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala#L88

As a result, the logs do not show that there was a problem during `Executor` creation, or that this is why the executor was killed.

This PR explicitly catches exceptions in `Executor` creation, logs a proper message, and then exits the JVM.

I have also changed the `exitExecutor` method to accept a `reason`, so that backends can use that reason for things like logging to a DB to get an aggregate of such exits at the cluster level.

## How was this patch tested?

I am relying on existing tests.

Author: Tejas Patil <tejasp@fb.com>

Closes #14202 from tejasapatil/exit_executor_failure.
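
The behavior described in this commit message can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: `SketchExecutor`, `SketchBackend`, `onRegistered`, `onLaunchTask`, `lastExitReason`, the injected `exit` function, and the log message strings are all hypothetical names chosen for illustration. Only `exitExecutor` and its `reason` parameter come from the description above, and the exact signature used here is an assumption.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for Spark's Executor; its constructor may throw.
final class SketchExecutor(failOnCreate: Boolean) {
  if (failOnCreate) throw new IllegalStateException("could not initialise executor")
}

// Hypothetical backend. `exit` is injected so the sketch can be exercised
// without terminating the JVM; by default it calls sys.exit.
class SketchBackend(exit: Int => Unit = code => sys.exit(code)) {
  private var executor: Option[SketchExecutor] = None

  // Remembered only so the sketch can be inspected after a simulated exit.
  var lastExitReason: Option[String] = None

  // Overridable hook: a subclass could also record `reason` elsewhere
  // (for example in a database) before the process exits.
  protected def exitExecutor(code: Int, reason: String, cause: Throwable = null): Unit = {
    lastExitReason = Some(reason)
    val suffix = Option(cause).map(c => s" (${c.getClass.getName})").getOrElse("")
    System.err.println(s"ERROR $reason$suffix")
    exit(code)
  }

  // Corresponds to handling RegisteredExecutor: a creation failure is
  // caught and reported immediately instead of leaving `executor` unset.
  def onRegistered(failOnCreate: Boolean): Unit =
    try {
      executor = Some(new SketchExecutor(failOnCreate))
    } catch {
      case NonFatal(e) =>
        exitExecutor(1, s"Unable to create executor due to ${e.getMessage}", e)
    }

  // Corresponds to handling LaunchTask: without the catch above, this is
  // the first place a failed creation would surface, with no root cause.
  def onLaunchTask(): Unit = executor match {
    case Some(_) => println("running task")
    case None    => exitExecutor(1, "Received LaunchTask but executor is not set")
  }
}
```

Making `exitExecutor` an overridable method that receives the reason is what would let a custom backend aggregate exit causes across a cluster, as the description suggests. Catching only `NonFatal` exceptions is a design choice of this sketch, so that errors such as `OutOfMemoryError` still propagate.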

Diffstat (limited to 'sql/catalyst/src')

0 files changed, 0 insertions, 0 deletions

