author     Tathagata Das <tathagata.das1565@gmail.com>  2014-11-25 23:15:58 -0800
committer  Reynold Xin <rxin@databricks.com>  2014-11-25 23:15:58 -0800
commit     e7f4d2534bb3361ec4b7af0d42bc798a7a425226 (patch)
tree       39f4682876a77e111413a4b21da85fe597a80ff6
parent     346bc17a2ec8fc9e6eaff90733aa1e8b6b46883e (diff)
[SPARK-4612] Reduce task latency and increase scheduling throughput by making configuration initialization lazy
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L337 creates a Hadoop configuration object for every task that is launched, even when there is no new dependent file or JAR to update. Constructing that object is heavyweight and should be avoided when there is nothing to fetch, so this PR makes the creation lazy: the configuration is only built when a new file or JAR actually needs to be downloaded.

A quick local run of the spark-perf scheduling-throughput tests in local standalone scheduler mode gives the following numbers for 1 job with 10000 tasks: 7.8395 seconds before, 2.6415 seconds after, roughly a 3x increase in task scheduling throughput.

pwendell JoshRosen

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #3463 from tdas/lazy-config and squashes the following commits:

c791c1e [Tathagata Das] Reduce task latency by making configuration initialization lazy
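To illustrate the pattern the commit relies on, here is a minimal, self-contained Scala sketch (not the actual Spark code; expensiveConfig is a hypothetical stand-in for SparkHadoopUtil.get.newConfiguration(conf)) showing how a local lazy val defers an expensive construction until its first use:

object LazyInitDemo {
  // Hypothetical stand-in for the heavyweight Hadoop Configuration creation.
  def expensiveConfig(): Map[String, String] = {
    println("building configuration...")  // side effect shows when init happens
    Map("fs.defaultFS" -> "hdfs://localhost:9000")
  }

  def updateDependencies(newFiles: Map[String, Long]): Unit = {
    // Declared lazily: nothing is built until the first access below.
    lazy val hadoopConf = expensiveConfig()
    for ((name, _) <- newFiles) {
      // The first access triggers exactly one initialization.
      println(s"fetching $name with ${hadoopConf.size} config entries")
    }
  }

  def main(args: Array[String]): Unit = {
    updateDependencies(Map.empty)             // common case: conf is never built
    updateDependencies(Map("dep.jar" -> 1L))  // conf is built once, on first use
  }
}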
-rw-r--r--  core/src/main/scala/org/apache/spark/executor/Executor.scala  2
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/core/src/main/scala/org/apache/spark/executor/Executor.scala b/core/src/main/scala/org/apache/spark/executor/Executor.scala
index 5fa584591d..835157fc52 100644
--- a/core/src/main/scala/org/apache/spark/executor/Executor.scala
+++ b/core/src/main/scala/org/apache/spark/executor/Executor.scala
@@ -334,7 +334,7 @@ private[spark] class Executor(
* SparkContext. Also adds any new JARs we fetched to the class loader.
*/
private def updateDependencies(newFiles: HashMap[String, Long], newJars: HashMap[String, Long]) {
- val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
+ lazy val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
synchronized {
// Fetch missing dependencies
for ((name, timestamp) <- newFiles if currentFiles.getOrElse(name, -1L) < timestamp) {
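A note on why this one-word change is safe: a Scala lazy val compiles to a guarded, initialize-at-most-once slot, so in the common case where newFiles and newJars are both empty the Hadoop Configuration is never constructed at all, and when dependencies do arrive it is built exactly once per call, just as before.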