diff options
author | Reynold Xin <rxin@apache.org> | 2014-09-06 19:06:30 -0700 |
---|---|---|
committer | Matei Zaharia <matei@databricks.com> | 2014-09-06 19:06:30 -0700 |
commit | 3fb57a0ab3d76fda2301dbe9f2f3fa6743b4ed78 (patch) | |
tree | 12db31afe1ca762b95c5b4862a97f55abf0592e2 /core/src/main/scala | |
parent | 110fb8b24d2454ad7c979c3934dbed87650f17b8 (diff) | |
download | spark-3fb57a0ab3d76fda2301dbe9f2f3fa6743b4ed78.tar.gz spark-3fb57a0ab3d76fda2301dbe9f2f3fa6743b4ed78.tar.bz2 spark-3fb57a0ab3d76fda2301dbe9f2f3fa6743b4ed78.zip |
[SPARK-3353] parent stage should have lower stage id.
Previously parent stages had higher stage id, but parent stages are executed first. This pull request changes the behavior so parent stages would have lower stage id.
For example, command:
```scala
sc.parallelize(1 to 10).map(x=>(x,x)).reduceByKey(_+_).count
```
breaks down into 2 stages.
The old web UI:
![screen shot 2014-09-04 at 12 42 44 am](https://cloud.githubusercontent.com/assets/323388/4146177/60fb4f42-3407-11e4-819f-853eb0e22b25.png)
Web UI with this patch:
![screen shot 2014-09-04 at 12 44 55 am](https://cloud.githubusercontent.com/assets/323388/4146178/62e08e62-3407-11e4-867b-a36b10534464.png)
Author: Reynold Xin <rxin@apache.org>
Closes #2273 from rxin/lower-stage-id and squashes the following commits:
abbb4c6 [Reynold Xin] Fixed SparkListenerSuite.
0e02379 [Reynold Xin] Updated DAGSchedulerSuite.
54ccea3 [Reynold Xin] [SPARK-3353] parent stage should have lower stage id.
Diffstat (limited to 'core/src/main/scala')
-rw-r--r-- | core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala index 2ccc27324a..6fcf9e3154 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala @@ -241,9 +241,9 @@ class DAGScheduler( callSite: CallSite) : Stage = { + val parentStages = getParentStages(rdd, jobId) val id = nextStageId.getAndIncrement() - val stage = - new Stage(id, rdd, numTasks, shuffleDep, getParentStages(rdd, jobId), jobId, callSite) + val stage = new Stage(id, rdd, numTasks, shuffleDep, parentStages, jobId, callSite) stageIdToStage(id) = stage updateJobIdStageIdMaps(jobId, stage) stage |