author    Tathagata Das <tathagata.das1565@gmail.com>  2014-11-25 23:15:58 -0800
committer Reynold Xin <rxin@databricks.com>            2014-11-25 23:16:14 -0800
commit    e8669729af4b49423a7514830436b2cb4ee6a08a (patch)
tree      a0f550e3cf2924e321c12ad0f9c55cd396203afb /core
parent    69d021b0becdffe225a1c8859d8c6adeb1a94f4a (diff)
[SPARK-4612] Reduce task latency and increase scheduling throughput by making configuration initialization lazy
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L337 creates a Hadoop configuration object for every task that is launched, even when there is no new dependent file/JAR to update. Creating this object is heavyweight and should be avoided whenever there is nothing to update. This PR makes that creation lazy. A quick local test with the spark-perf scheduling-throughput benchmark, in local standalone scheduler mode, gives the following numbers.
1 job with 10000 tasks: before 7.8395 seconds, after 2.6415 seconds = roughly 3x increase in task scheduling throughput
pwendell JoshRosen
Author: Tathagata Das <tathagata.das1565@gmail.com>
Closes #3463 from tdas/lazy-config and squashes the following commits:
c791c1e [Tathagata Das] Reduce task latency by making configuration initialization lazy
(cherry picked from commit e7f4d2534bb3361ec4b7af0d42bc798a7a425226)
Signed-off-by: Reynold Xin <rxin@databricks.com>
Diffstat (limited to 'core')
 core/src/main/scala/org/apache/spark/executor/Executor.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/core/src/main/scala/org/apache/spark/executor/Executor.scala b/core/src/main/scala/org/apache/spark/executor/Executor.scala
index 5fa584591d..835157fc52 100644
--- a/core/src/main/scala/org/apache/spark/executor/Executor.scala
+++ b/core/src/main/scala/org/apache/spark/executor/Executor.scala
@@ -334,7 +334,7 @@ private[spark] class Executor(
    * SparkContext. Also adds any new JARs we fetched to the class loader.
    */
   private def updateDependencies(newFiles: HashMap[String, Long], newJars: HashMap[String, Long]) {
-    val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
+    lazy val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
     synchronized {
       // Fetch missing dependencies
       for ((name, timestamp) <- newFiles if currentFiles.getOrElse(name, -1L) < timestamp) {
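The one-line change above works because a Scala `lazy val` defers its initializer until first access, so tasks that bring no new files or JARs never pay for building the configuration. A minimal standalone sketch of the idea (the names `constructions`, `heavyConf`, and `updateDependencies` here are illustrative stand-ins, not the actual Executor internals):

```scala
// Counts how many times the "expensive" object gets built.
var constructions = 0

def updateDependencies(newFiles: Map[String, Long]): Unit = {
  // Stand-in for SparkHadoopUtil.get.newConfiguration(conf):
  // with `lazy val`, this block runs at most once per call, and only
  // if some file actually needs fetching.
  lazy val heavyConf = { constructions += 1; "hadoop-configuration" }
  for ((name, _) <- newFiles) {
    // First access here triggers the one-time initialization.
    val conf = heavyConf
    require(conf.nonEmpty) // ... fetch `name` using `conf` ...
  }
}

updateDependencies(Map.empty)                          // nothing to fetch
println(constructions)                                 // heavy object never built: 0
updateDependencies(Map("a.jar" -> 1L, "b.jar" -> 2L))  // two files, but one init
println(constructions)                                 // 1
```

With an eager `val`, both calls would construct the object; with `lazy val`, the common fast path (no new dependencies) skips it entirely, which is where the scheduling-throughput gain comes from.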