author | Tejas Patil <tejasp@fb.com> | 2016-09-15 10:23:41 -0700 |
---|---|---|
committer | Shixiong Zhu <shixiong@databricks.com> | 2016-09-15 10:23:41 -0700 |
commit | b479278142728eb003b9ee466fab0e8d6ec4b13d (patch) | |
tree | 7e83576cf49a9eef9e9e99bf966ec7b98c0c027f /mesos | |
parent | 2ad276954858b0a7b3f442b9e440c72cbb1610e2 (diff) | |
download | spark-b479278142728eb003b9ee466fab0e8d6ec4b13d.tar.gz spark-b479278142728eb003b9ee466fab0e8d6ec4b13d.tar.bz2 spark-b479278142728eb003b9ee466fab0e8d6ec4b13d.zip |
[SPARK-17451][CORE] CoarseGrainedExecutorBackend should inform driver before self-kill
## What changes were proposed in this pull request?
Jira: https://issues.apache.org/jira/browse/SPARK-17451
`CoarseGrainedExecutorBackend` exits the JVM in some failure cases. The exit itself is not a problem, but the driver UI captures no specific reason for it. This PR adds functionality to `exitExecutor` to notify the driver that the executor is exiting.
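The idea can be illustrated with a small, self-contained Python model. This is only a sketch of the "notify, then exit" pattern, not the Scala code in this patch: the `Driver` and `ExecutorBackend` classes, the `remove_executor` method, and the `notify_driver` and `exit_fn` parameters are all hypothetical names chosen for illustration.

```python
import sys
from typing import Callable, Dict, Optional


class Driver:
    """Hypothetical stand-in for the driver: records why each executor was lost."""

    def __init__(self) -> None:
        self.loss_reasons: Dict[str, str] = {}

    def remove_executor(self, executor_id: str, reason: str) -> None:
        self.loss_reasons[executor_id] = reason


class ExecutorBackend:
    """Hypothetical stand-in for an executor backend that may self-terminate."""

    def __init__(
        self,
        executor_id: str,
        driver: Optional[Driver] = None,  # None models "no driver connection yet"
        exit_fn: Callable[[int], None] = sys.exit,  # injectable so tests can avoid exiting
    ) -> None:
        self.executor_id = executor_id
        self.driver = driver
        self.exit_fn = exit_fn

    def exit_executor(self, code: int, reason: str, notify_driver: bool = True) -> None:
        """Report the exit reason to the driver (best effort), then exit."""
        print(f"Executor self-exiting due to: {reason}", file=sys.stderr)
        if notify_driver and self.driver is not None:
            try:
                self.driver.remove_executor(self.executor_id, reason)
            except Exception as exc:
                # A failed notification must not prevent the exit itself.
                print(f"Unable to notify the driver: {exc}", file=sys.stderr)
        self.exit_fn(code)
```

In this sketch the notification is best effort: if the driver is unreachable or the call fails, the executor still exits with the requested code, and the driver falls back to whatever generic reason it would otherwise report.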
## How was this patch tested?
Ran the change in a test environment and took down the shuffle service before the executor could register with it. In the driver logs, where the job failure reason is reported (i.e. `Job aborted due to stage ...`), it now gives the correct reason:
Before:
`ExecutorLostFailure (executor ZZZZZZZZZ exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.`
After:
`ExecutorLostFailure (executor ZZZZZZZZZ exited caused by one of the running tasks) Reason: Unable to create executor due to java.util.concurrent.TimeoutException: Timeout waiting for task.`
Author: Tejas Patil <tejasp@fb.com>
Closes #15013 from tejasapatil/SPARK-17451_inform_driver.
Diffstat (limited to 'mesos')
0 files changed, 0 insertions, 0 deletions