author: Takuya UESHIN <ueshin@happy-camper.st> 2016-05-25 13:57:25 -0700
committer: Kay Ousterhout <kayousterhout@gmail.com> 2016-05-25 13:57:25 -0700
commit: 698ef762f80cf4c84bc7b7cf083aa97d44b87170
tree: 67e7d221728fec151ffe759ffa34eb64cfe31bc1 /examples
parent: c875d81a3de3f209b9eb03adf96b7c740b2c7b52
[SPARK-14269][SCHEDULER] Eliminate unnecessary submitStage() call.
## What changes were proposed in this pull request?

Currently, `submitStage()` is called for waiting stages on every iteration of the event loop in `DAGScheduler` in order to submit all waiting stages, but most of these calls are unnecessary because the events being handled are not related to stage status. Waiting stages only need to be submitted when their parent stages have completed successfully. Eliminating the unnecessary calls can improve `DAGScheduler` performance.

## How was this patch tested?

Added some checks, and ran the other existing tests and our own projects.

We have a project bottlenecked by `DAGScheduler`, with about 2000 stages. Before this patch, almost all of the execution time in the `Driver` process was spent in `submitStage()` on the `dag-scheduler-event-loop` thread; after this patch, performance improved as follows:

|        | total execution time | `dag-scheduler-event-loop` thread time | `submitStage()` |
|--------|---------------------:|---------------------------------------:|----------------:|
| Before | 760 sec | 710 sec | 667 sec |
| After  | 440 sec | 14 sec | 10 sec |

Author: Takuya UESHIN <ueshin@happy-camper.st>

Closes #12060 from ueshin/issues/SPARK-14269.
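The idea in this commit message, resubmitting waiting stages only when one of their parents completes successfully, can be illustrated with a small toy model. This is a minimal sketch and not Spark's actual `DAGScheduler` code: `Stage`, `ToyScheduler`, `submitWaitingChildStages`, `onStageSuccess`, and `submitCalls` are hypothetical names chosen for illustration, and the real scheduler tracks far more state.

```scala
import scala.collection.mutable

// Hypothetical, simplified stage: an id plus its direct parent stages.
final class Stage(val id: Int, val parents: Seq[Stage])

// Toy scheduler (an assumption for illustration, not Spark's implementation).
class ToyScheduler {
  val waiting = mutable.LinkedHashSet.empty[Stage]
  val running = mutable.LinkedHashSet.empty[Stage]
  val finished = mutable.Set.empty[Stage]
  var submitCalls = 0 // counts how often submitStage() is invoked

  // Run the stage if all parents are finished; otherwise submit the
  // missing parents and park the stage in the waiting set.
  def submitStage(stage: Stage): Unit = {
    submitCalls += 1
    if (!waiting(stage) && !running(stage) && !finished(stage)) {
      val missing = stage.parents.filterNot(finished)
      if (missing.isEmpty) {
        running += stage
      } else {
        missing.foreach(submitStage)
        waiting += stage
      }
    }
  }

  // Resubmit only the waiting stages that depend on the given parent,
  // instead of retrying every waiting stage on every event.
  def submitWaitingChildStages(parent: Stage): Unit = {
    val children = waiting.filter(_.parents.contains(parent)).toList
    waiting --= children
    children.sortBy(_.id).foreach(submitStage)
  }

  // The only place where waiting stages are retried.
  def onStageSuccess(stage: Stage): Unit = {
    running -= stage
    finished += stage
    submitWaitingChildStages(stage)
  }
}

object ToySchedulerDemo {
  def main(args: Array[String]): Unit = {
    val a = new Stage(0, Nil)
    val b = new Stage(1, Seq(a))
    val s = new ToyScheduler
    s.submitStage(b)    // a starts running, b waits on a
    s.onStageSuccess(a) // b is resubmitted now that its parent is done
    println(s"running=${s.running.map(_.id)} submitCalls=${s.submitCalls}")
  }
}
```

In this sketch, events that do not complete a stage never touch the waiting set, so the number of `submitStage()` calls grows with the number of stage completions and not with the number of event-loop iterations.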
Diffstat (limited to 'examples')
0 files changed, 0 insertions, 0 deletions