author | Eren Avsarogullari <erenavsarogullari@gmail.com> | 2016-10-24 15:33:02 -0700 |
---|---|---|
committer | Kay Ousterhout <kayousterhout@gmail.com> | 2016-10-24 15:33:54 -0700 |
commit | 81d6933e75579343b1dd14792c18149e97e92cdd | |
tree | 37977c7774c766b846f6ffa33708270091c68b93 /.gitattributes | |
parent | 4ecbe1b92f4c4c5b2d734895c09d8ded0ed48d4d | |
[SPARK-17894][CORE] Ensure uniqueness of TaskSetManager name.
`TaskSetManager` should have a unique name to avoid adding duplicates to the parent `Pool` via `SchedulableBuilder`. This problem surfaced in the following discussion: [[PR: Avoid adding duplicate schedulables]](https://github.com/apache/spark/pull/15326)
**Proposal** :
There is a one-to-one relationship between `stageAttemptId` and `TaskSetManager`, so `taskSet.id`, which covers both `stageId` and `stageAttemptId`, can be used to make the `TaskSetManager` name unique instead of `stageId` alone.
**Current TaskSetManager Name** :
`var name = "TaskSet_" + taskSet.stageId.toString`
**Sample**: TaskSet_0
**Proposed TaskSetManager Name** :
`val name = "TaskSet_" + taskSet.id // taskSet.id = (stageId + "." + stageAttemptId)`
**Sample** : TaskSet_0.0
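The difference between the two naming schemes can be illustrated with a minimal, self-contained sketch. `TaskSetSketch` and `TaskSetNaming` are hypothetical stand-ins written for this example, not Spark classes; they model only the `stageId`, `stageAttemptId`, and combined id described above.

```scala
// Hypothetical stand-in for a task set; only the fields relevant to naming are modeled.
final case class TaskSetSketch(stageId: Int, stageAttemptId: Int) {
  // Combined id as described in the proposal: stageId + "." + stageAttemptId
  val id: String = s"$stageId.$stageAttemptId"
}

object TaskSetNaming {
  // Name before the change: derived from stageId only.
  def oldName(ts: TaskSetSketch): String = "TaskSet_" + ts.stageId.toString

  // Name after the change: derived from the combined id.
  def newName(ts: TaskSetSketch): String = "TaskSet_" + ts.id

  def main(args: Array[String]): Unit = {
    val first = TaskSetSketch(stageId = 0, stageAttemptId = 0)
    val retry = TaskSetSketch(stageId = 0, stageAttemptId = 1)

    // Two attempts of the same stage collide under the old scheme.
    println(oldName(first) == oldName(retry)) // prints "true"

    // Under the new scheme each attempt gets its own name.
    println(newName(first)) // prints "TaskSet_0.0"
    println(newName(retry)) // prints "TaskSet_0.1"
  }
}
```

In this sketch, a retried stage attempt would share its name with the original attempt under the old scheme, which is the kind of duplicate a name-keyed parent pool cannot tell apart.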
Added a new unit test.
Author: erenavsarogullari <erenavsarogullari@gmail.com>
Closes #15463 from erenavsarogullari/SPARK-17894.