diff options
author | Sean Owen <sowen@cloudera.com> | 2016-07-06 13:36:07 -0700 |
---|---|---|
committer | Reynold Xin <rxin@databricks.com> | 2016-07-06 13:36:07 -0700 |
commit | a8f89df3b391e7a3fa9f73d9ec730d6eaa95bb09 (patch) | |
tree | e516635c27bf5e371200e66deb38506894fca223 /core/src | |
parent | 040f6f9f468f153e4c4db78c26ced0299245fb6f (diff) | |
download | spark-a8f89df3b391e7a3fa9f73d9ec730d6eaa95bb09.tar.gz spark-a8f89df3b391e7a3fa9f73d9ec730d6eaa95bb09.tar.bz2 spark-a8f89df3b391e7a3fa9f73d9ec730d6eaa95bb09.zip |
[SPARK-16379][CORE][MESOS] Spark on mesos is broken due to race condition in Logging
## What changes were proposed in this pull request?
The commit https://github.com/apache/spark/commit/044971eca0ff3c2ce62afa665dbd3072d52cbbec introduced a lazy val to simplify code in Logging. Simple enough, though one side effect is that accessing log now means grabbing the instance's lock. This in turn turned up a form of deadlock in the Mesos code. It was arguably a bit of a problem in how this code is structured, but, in any event the safest thing to do seems to be to revert the commit, and that's 90% of the change here; it's just not worth the risk of similar more subtle issues.
What I didn't revert here was the removal of this odd override of log in the Mesos code. In retrospect it might have been put in place at some stage as a defense against this type of problem. After all the Logging code still involved a lock at initialization before the change in question.
Even after the revert, it doesn't seem like it does anything, given how Logging works now, so I left it removed. However, I also removed the particular log message that ended up playing a part in this problem anyway, maybe being paranoid, to make sure this type of problem can't happen even with how the current locking works in logging initialization.
## How was this patch tested?
Jenkins tests
Author: Sean Owen <sowen@cloudera.com>
Closes #14069 from srowen/SPARK-16379.
Diffstat (limited to 'core/src')
-rw-r--r-- | core/src/main/scala/org/apache/spark/internal/Logging.scala | 14 | ||||
-rw-r--r-- | core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala | 1 |
2 files changed, 10 insertions, 5 deletions
diff --git a/core/src/main/scala/org/apache/spark/internal/Logging.scala b/core/src/main/scala/org/apache/spark/internal/Logging.scala index c51050c13d..66a0cfec62 100644 --- a/core/src/main/scala/org/apache/spark/internal/Logging.scala +++ b/core/src/main/scala/org/apache/spark/internal/Logging.scala @@ -32,10 +32,7 @@ private[spark] trait Logging { // Make the log field transient so that objects with Logging can // be serialized and used on another machine - @transient lazy val log: Logger = { - initializeLogIfNecessary(false) - LoggerFactory.getLogger(logName) - } + @transient private var log_ : Logger = null // Method to get the logger name for this object protected def logName = { @@ -43,6 +40,15 @@ private[spark] trait Logging { this.getClass.getName.stripSuffix("$") } + // Method to get or create the logger for this object + protected def log: Logger = { + if (log_ == null) { + initializeLogIfNecessary(false) + log_ = LoggerFactory.getLogger(logName) + } + log_ + } + // Log methods that take only a String protected def logInfo(msg: => String) { if (log.isInfoEnabled) log.info(msg) diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala index e88e4ad475..99e6d39583 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala @@ -244,7 +244,6 @@ private[spark] class MesosCoarseGrainedSchedulerBackend( d: org.apache.mesos.SchedulerDriver, frameworkId: FrameworkID, masterInfo: MasterInfo) { appId = frameworkId.getValue mesosExternalShuffleClient.foreach(_.init(appId)) - logInfo("Registered as framework ID " + appId) markRegistered() } |