author | Tejas Patil <tejasp@fb.com> | 2016-07-15 14:27:16 -0700 |
---|---|---|
committer | Shixiong Zhu <shixiong@databricks.com> | 2016-07-15 14:27:16 -0700 |
commit | b2f24f94591082d3ff82bd3db1760b14603b38aa (patch) | |
tree | 8cc3f4f5fffd814aba5dc07288ae353c50a8503f /sql/catalyst/src | |
parent | 611a8ca5895357059f1e7c035d946e0718b26a5a (diff) | |
[SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if there is an exception while creating an Executor
## What changes were proposed in this pull request?
With the fix from SPARK-13112, I see that `LaunchTask` is always processed after `RegisteredExecutor` is done, so the backend gets a chance to do all of its retries to start up an executor. There is still a problem: if `Executor` creation itself fails with an exception, the failure goes unnoticed, and the executor is killed only later, when it tries to process `LaunchTask` and finds that `executor` is null: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala#L88 So if one looks at the logs, they do not say that there was a problem during `Executor` creation or that this is why the executor was killed.
This PR explicitly catches exceptions thrown during `Executor` creation, logs a proper message, and then exits the JVM. I have also changed the `exitExecutor` method to accept a `reason`, so that backends can use that reason for things like logging to a DB to get an aggregate of such exits at the cluster level.
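The pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the patch: `Backend`, `RecordingBackend`, `onRegistered`, the exit code, the message text, and the simplified `Executor` stand-in are all hypothetical; only the names `Executor`, `executor`, `exitExecutor` and `reason` come from the description.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for the real Executor; it fails on an empty id
// so that the failure path can be exercised.
final class Executor(id: String) {
  if (id.isEmpty) throw new IllegalArgumentException("executor id must not be empty")
}

// Hypothetical, heavily simplified backend (not Spark's actual class).
class Backend(executorId: String) {
  // Stays null if creation fails, which is the situation the PR describes.
  var executor: Executor = null

  // Overridable hook: a subclass could also record `reason` elsewhere
  // (for example in a DB) before the JVM exits.
  protected def exitExecutor(code: Int, reason: String, cause: Throwable = null): Unit = {
    System.err.println(s"Exiting executor: $reason")
    System.exit(code)
  }

  // Assumed to run when the driver acknowledges registration.
  def onRegistered(): Unit =
    try {
      executor = new Executor(executorId)
    } catch {
      // Surface the creation failure right away instead of failing later on a null executor.
      case NonFatal(e) =>
        exitExecutor(1, s"Unable to create executor due to ${e.getMessage}", e)
    }
}

// Variant that records the exit instead of terminating the JVM,
// illustrating how a backend might make use of `reason`.
class RecordingBackend(id: String) extends Backend(id) {
  var exits: List[(Int, String)] = Nil
  override protected def exitExecutor(code: Int, reason: String, cause: Throwable): Unit =
    exits = (code, reason) :: exits
}
```

Making `exitExecutor` an overridable method that receives the reason is what would let a custom backend aggregate exit causes; in this sketch `RecordingBackend` only keeps them in memory.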
## How was this patch tested?
I am relying on existing tests.
Author: Tejas Patil <tejasp@fb.com>
Closes #14202 from tejasapatil/exit_executor_failure.