author     Josh Rosen <joshrosen@apache.org>      2014-10-19 00:31:06 -0700
committer  Josh Rosen <joshrosen@databricks.com>  2014-10-19 00:35:05 -0700
commit     7e63bb49c526c3f872619ae14e4b5273f4c535e9 (patch)
tree       241f07bb2627381f75b0b3791d0dbbac35baa5ea /docs
parent     05db2da7dc256822cdb602c4821cbb9fb84dac98 (diff)
[SPARK-2546] Clone JobConf for each task (branch-1.0 / 1.1 backport)
This patch attempts to fix SPARK-2546 in `branch-1.0` and `branch-1.1`. The underlying problem is that thread-safety issues in Hadoop Configuration objects may cause Spark tasks to get stuck in infinite loops. The approach taken here is to clone a new copy of the JobConf for each task rather than sharing a single copy between tasks. Note that there are still Configuration thread-safety issues that may affect the driver, but these seem much less likely to occur in practice and will be more complex to fix (see discussion on the SPARK-2546 ticket).
This cloning is guarded by a new configuration option (`spark.hadoop.cloneConf`) and is disabled by default in order to avoid unexpected performance regressions for workloads that are unaffected by the Configuration thread-safety issues.
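The guarded-clone pattern described above can be sketched as follows. This is an illustrative stand-in, not the actual HadoopRDD code: `FakeJobConf` is a hypothetical substitute for Hadoop's `JobConf` (a mutable, non-thread-safe key/value configuration, which is the root cause behind SPARK-2546), and the names `CloneConfSketch` and `getJobConf` are chosen for illustration.

```scala
import java.util.Properties

// Stand-in for Hadoop's JobConf: a mutable key/value configuration.
// (Hadoop's Configuration is mutable and not thread-safe.)
class FakeJobConf(init: Properties = new Properties()) {
  val props = new Properties()
  init.stringPropertyNames().forEach(k => props.setProperty(k, init.getProperty(k)))
  def set(k: String, v: String): Unit = props.setProperty(k, v)
  def get(k: String): String = props.getProperty(k)
}

object CloneConfSketch {
  // Cloning a Configuration can itself race with concurrent mutation, so the
  // copy is made while holding a global lock.
  private val CONFIGURATION_INSTANTIATION_LOCK = new Object

  // When cloneConf is true (spark.hadoop.cloneConf=true), each task gets its
  // own copy, so tasks never mutate a shared JobConf concurrently. When false
  // (the default), the single shared copy is returned, which is faster but
  // subject to the SPARK-2546 thread-safety issue.
  def getJobConf(shared: FakeJobConf, cloneConf: Boolean): FakeJobConf = {
    if (cloneConf) {
      CONFIGURATION_INSTANTIATION_LOCK.synchronized {
        new FakeJobConf(shared.props)
      }
    } else {
      shared
    }
  }
}
```

The key trade-off, reflected in the default, is that cloning a configuration per task adds measurable overhead for jobs with many short tasks, so only workloads actually hitting the thread-safety bug should opt in.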
Author: Josh Rosen <joshrosen@apache.org>
Closes #2684 from JoshRosen/jobconf-fix-backport and squashes the following commits:
f14f259 [Josh Rosen] Add configuration option to control cloning of Hadoop JobConf.
b562451 [Josh Rosen] Remove unused jobConfCacheKey field.
dd25697 [Josh Rosen] [SPARK-2546] [1.0 / 1.1 backport] Clone JobConf for each task.
(cherry picked from commit 2cd40db2b3ab5ddcb323fd05c171dbd9025f9e71)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Conflicts:
core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala
Diffstat (limited to 'docs')
-rw-r--r--  docs/configuration.md  9
1 file changed, 9 insertions, 0 deletions
```diff
diff --git a/docs/configuration.md b/docs/configuration.md
index f0204c640b..96fa1377ec 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -620,6 +620,15 @@ Apart from these, the following properties are also available, and may be useful
   previous versions of Spark. Simply use Hadoop's FileSystem API to delete output
   directories by hand.</td>
 </tr>
 <tr>
+  <td><code>spark.hadoop.cloneConf</code></td>
+  <td>false</td>
+  <td>If set to true, clones a new Hadoop <code>Configuration</code> object for each task. This
+    option should be enabled to work around <code>Configuration</code> thread-safety issues (see
+    <a href="https://issues.apache.org/jira/browse/SPARK-2546">SPARK-2546</a> for more details).
+    This is disabled by default in order to avoid unexpected performance regressions for jobs that
+    are not affected by these issues.</td>
+</tr>
+<tr>
   <td><code>spark.executor.heartbeatInterval</code></td>
   <td>10000</td>
   <td>Interval (milliseconds) between each executor's heartbeats to the driver. Heartbeats let
```
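For users affected by SPARK-2546, the new option can be enabled at submit time like any other `spark.*` property. The application class and jar below are placeholders:

```shell
# Opt in to per-task JobConf cloning (disabled by default).
spark-submit \
  --conf spark.hadoop.cloneConf=true \
  --class com.example.MyApp \
  my-app.jar
```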