author Wenchen Fan <wenchen@databricks.com> 2016-04-29 19:01:38 -0700
committer Reynold Xin <rxin@databricks.com> 2016-04-29 19:01:38 -0700
commit b056e8cb0a7c58c3e4d199af3ee13be50305b747
tree 545b7f0d2de5e807b7a18c3d8cae4af7b184a3f0
parent b33d6b72886db35ea042e29a8c08cd73bf9d4b0c
[SPARK-15010][CORE] new accumulator should be tolerant of local RPC message delivery
## What changes were proposed in this pull request?

The RPC framework does not serialize and deserialize messages in local mode. We should therefore not call `acc.value` when receiving a heartbeat message, because the serialization hook of the new accumulator may not have been triggered and the `atDriverSide` flag may not be set.

## How was this patch tested?

Tested locally via the Spark shell.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #12795 from cloud-fan/bug.
-rw-r--r-- core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala 9
1 files changed, 7 insertions, 2 deletions
diff --git a/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala b/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
index 776a3226cc..8fa4aa121c 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
@@ -389,9 +389,14 @@ private[spark] class TaskSchedulerImpl(
     // (taskId, stageId, stageAttemptId, accumUpdates)
     val accumUpdatesWithTaskIds: Array[(Long, Int, Int, Seq[AccumulableInfo])] = synchronized {
       accumUpdates.flatMap { case (id, updates) =>
+        // We should call `acc.value` here as we are at driver side now. However, the RPC framework
+        // optimizes local message delivery so that messages do not need to be serialized and
+        // deserialized. This brings trouble to the accumulator framework, which depends on
+        // serialization to set the `atDriverSide` flag. Here we call `acc.localValue` instead to
+        // be more robust about this issue.
+        val accInfos = updates.map(acc => acc.toInfo(Some(acc.localValue), None))
         taskIdToTaskSetManager.get(id).map { taskSetMgr =>
-          (id, taskSetMgr.stageId, taskSetMgr.taskSet.stageAttemptId,
-            updates.map(acc => acc.toInfo(Some(acc.value), None)))
+          (id, taskSetMgr.stageId, taskSetMgr.taskSet.stageAttemptId, accInfos)
         }
       }
     }
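
The failure mode described in the commit message can be reproduced in isolation. The following is a minimal sketch, not Spark's accumulator code: `SketchAccumulator`, `roundTrip`, and `Demo` are hypothetical names introduced for illustration, and the assumption that the flag is flipped inside a `readObject` hook is a simplification of the "serialization hook" the message refers to. It shows that an object delivered without a serialization round trip keeps its executor-side flag, so `value` fails while `localValue` still works.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for an accumulator; NOT Spark's actual class.
class SketchAccumulator(initial: Long) extends Serializable {
  private var sum: Long = initial
  // Assumption for this sketch: the object is created on the driver,
  // and every deserialization flips which side it believes it is on.
  private var atDriverSide: Boolean = true

  def add(v: Long): Unit = sum += v

  // Always readable, regardless of the side flag.
  def localValue: Long = sum

  // Only readable when the flag says we are on the driver.
  def value: Long =
    if (atDriverSide) localValue
    else throw new UnsupportedOperationException("Can't read accumulator value in task")

  // Java serialization hook: runs only when the object is actually deserialized.
  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    atDriverSide = !atDriverSide
  }
}

object SketchAccumulator {
  // Simulates sending an object over a transport that serializes it.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val onDriver = new SketchAccumulator(0L)
    // The task is shipped to the executor: one deserialization, flag becomes false.
    val onExecutor = SketchAccumulator.roundTrip(onDriver)
    onExecutor.add(5L)

    // Remote delivery back to the driver: second deserialization, flag becomes true.
    val viaRemoteRpc = SketchAccumulator.roundTrip(onExecutor)
    println(viaRemoteRpc.value) // prints 5

    // Local delivery: the same reference is handed over, so the hook never runs.
    val viaLocalRpc = onExecutor
    println(scala.util.Try(viaLocalRpc.value).isFailure) // prints true
    println(viaLocalRpc.localValue) // prints 5
  }
}
```

Reading `localValue` on the receiving side, as the patch does, avoids depending on whether the transport happened to serialize the message.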