diff options
author | Reynold Xin <rxin@apache.org> | 2014-09-29 22:56:22 -0700 |
---|---|---|
committer | Reynold Xin <rxin@apache.org> | 2014-09-29 22:56:22 -0700 |
commit | 6b79bfb42580b6bd4c4cd99fb521534a94150693 (patch) | |
tree | d2cf3ee0b180a83062f529530f2de7e13b6c2391 /project/MimaExcludes.scala | |
parent | 210404a56197ad347f1e621ed53ef01327fba2bd (diff) | |
download | spark-6b79bfb42580b6bd4c4cd99fb521534a94150693.tar.gz spark-6b79bfb42580b6bd4c4cd99fb521534a94150693.tar.bz2 spark-6b79bfb42580b6bd4c4cd99fb521534a94150693.zip |
[SPARK-3613] Record only average block size in MapStatus for large stages
This changes the way we send MapStatus from executors back to driver for large stages (>2000 tasks). For large stages, we no longer send one byte per block. Instead, we just send the average block size.
This makes large jobs (tens of thousands of tasks) much more reliable since the driver no longer sends huge amount of data.
Author: Reynold Xin <rxin@apache.org>
Closes #2470 from rxin/mapstatus and squashes the following commits:
822ff54 [Reynold Xin] Code review feedback.
3b86f56 [Reynold Xin] Added MimaExclude.
f89d182 [Reynold Xin] Fixed a bug in MapStatus
6a0401c [Reynold Xin] [SPARK-3613] Record only average block size in MapStatus for large stages.
Diffstat (limited to 'project/MimaExcludes.scala')
-rw-r--r-- | project/MimaExcludes.scala | 5 |
1 files changed, 4 insertions, 1 deletions
diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala index 1adfaa18c6..4076ebc6fc 100644 --- a/project/MimaExcludes.scala +++ b/project/MimaExcludes.scala @@ -45,7 +45,10 @@ object MimaExcludes { ProblemFilters.exclude[MissingMethodProblem]( "org.apache.spark.mllib.stat.MultivariateStatisticalSummary.normL1"), ProblemFilters.exclude[MissingMethodProblem]( - "org.apache.spark.mllib.stat.MultivariateStatisticalSummary.normL2") + "org.apache.spark.mllib.stat.MultivariateStatisticalSummary.normL2"), + // MapStatus should be private[spark] + ProblemFilters.exclude[IncompatibleTemplateDefProblem]( + "org.apache.spark.scheduler.MapStatus") ) case v if v.startsWith("1.1") => |