author    Josh Rosen <joshrosen@apache.org>  2014-10-19 00:31:06 -0700
committer Josh Rosen <joshrosen@databricks.com>  2014-10-19 00:35:05 -0700
commit    7e63bb49c526c3f872619ae14e4b5273f4c535e9 (patch)
tree      241f07bb2627381f75b0b3791d0dbbac35baa5ea /docs/configuration.md
parent    05db2da7dc256822cdb602c4821cbb9fb84dac98 (diff)
[SPARK-2546] Clone JobConf for each task (branch-1.0 / 1.1 backport)
This patch attempts to fix SPARK-2546 in `branch-1.0` and `branch-1.1`.

The underlying problem is that thread-safety issues in Hadoop Configuration objects may cause Spark tasks to get stuck in infinite loops. The approach taken here is to clone a new copy of the JobConf for each task rather than sharing a single copy between tasks. Note that there are still Configuration thread-safety issues that may affect the driver, but these seem much less likely to occur in practice and will be more complex to fix (see discussion on the SPARK-2546 ticket).

This cloning is guarded by a new configuration option (`spark.hadoop.cloneConf`) and is disabled by default in order to avoid unexpected performance regressions for workloads that are unaffected by the Configuration thread-safety issues.

Author: Josh Rosen <joshrosen@apache.org>

Closes #2684 from JoshRosen/jobconf-fix-backport and squashes the following commits:

f14f259 [Josh Rosen] Add configuration option to control cloning of Hadoop JobConf.
b562451 [Josh Rosen] Remove unused jobConfCacheKey field.
dd25697 [Josh Rosen] [SPARK-2546] [1.0 / 1.1 backport] Clone JobConf for each task.

(cherry picked from commit 2cd40db2b3ab5ddcb323fd05c171dbd9025f9e71)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>

Conflicts:
	core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala
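Below is a minimal, hypothetical sketch of how a user application might enable the new option from Scala. Only the property name `spark.hadoop.cloneConf` comes from this patch; the application name, input path, and the rest are ordinary Spark 1.x API usage chosen for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CloneConfExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("CloneConfExample")
      // Clone a fresh Hadoop JobConf for each task to work around the
      // Configuration thread-safety issues described in SPARK-2546.
      // Disabled by default to avoid performance regressions for jobs
      // that are not affected by those issues.
      .set("spark.hadoop.cloneConf", "true")

    val sc = new SparkContext(conf)
    // Hadoop-backed inputs such as text files are where the per-task
    // JobConf is used. The path below is a placeholder.
    val lines = sc.textFile("hdfs:///path/to/input")
    println(s"line count: ${lines.count()}")
    sc.stop()
  }
}
```

The same flag can typically also be supplied at submission time (for example via `spark-submit --conf spark.hadoop.cloneConf=true`) rather than being hard-coded in the application.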
Diffstat (limited to 'docs/configuration.md')
-rw-r--r--  docs/configuration.md  9
1 file changed, 9 insertions, 0 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index f0204c640b..96fa1377ec 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -620,6 +620,15 @@ Apart from these, the following properties are also available, and may be useful
previous versions of Spark. Simply use Hadoop's FileSystem API to delete output directories by hand.</td>
</tr>
<tr>
+ <td><code>spark.hadoop.cloneConf</code></td>
+ <td>false</td>
+ <td>If set to true, clones a new Hadoop <code>Configuration</code> object for each task. This
+ option should be enabled to work around <code>Configuration</code> thread-safety issues (see
+ <a href="https://issues.apache.org/jira/browse/SPARK-2546">SPARK-2546</a> for more details).
+ This is disabled by default in order to avoid unexpected performance regressions for jobs that
+ are not affected by these issues.</td>
+</tr>
+<tr>
<td><code>spark.executor.heartbeatInterval</code></td>
<td>10000</td>
<td>Interval (milliseconds) between each executor's heartbeats to the driver. Heartbeats let