aboutsummaryrefslogtreecommitdiff
path: root/docs
diff options
context:
space:
mode:
authorli-zhihui <zhihui.li@intel.com>2014-07-14 15:32:49 -0500
committerThomas Graves <tgraves@apache.org>2014-07-14 15:32:49 -0500
commit3dd8af7a6623201c28231f4b71f59ea4e9ae29bf (patch)
tree35a9569374a32a9a297d3ea4322732d528be7a6c /docs
parentd60b09bb60cff106fa0acddebf35714503b20f03 (diff)
downloadspark-3dd8af7a6623201c28231f4b71f59ea4e9ae29bf.tar.gz
spark-3dd8af7a6623201c28231f4b71f59ea4e9ae29bf.tar.bz2
spark-3dd8af7a6623201c28231f4b71f59ea4e9ae29bf.zip
[SPARK-1946] Submit tasks after (configured ratio) executors have been registered
Because submitting tasks and registering executors are asynchronous, in most situation, early stages' tasks run without preferred locality. A simple solution is sleeping few seconds in application, so that executors have enough time to register. The PR add 2 configuration properties to make TaskScheduler submit tasks after a few of executors have been registered. \# Submit tasks only after (registered executors / total executors) arrived the ratio, default value is 0 spark.scheduler.minRegisteredExecutorsRatio = 0.8 \# Whatever minRegisteredExecutorsRatio is arrived, submit tasks after the maxRegisteredWaitingTime(millisecond), default value is 30000 spark.scheduler.maxRegisteredExecutorsWaitingTime = 5000 Author: li-zhihui <zhihui.li@intel.com> Closes #900 from li-zhihui/master and squashes the following commits: b9f8326 [li-zhihui] Add logs & edit docs 1ac08b1 [li-zhihui] Add new configs to user docs 22ead12 [li-zhihui] Move waitBackendReady to postStartHook c6f0522 [li-zhihui] Bug fix: numExecutors wasn't set & use constant DEFAULT_NUMBER_EXECUTORS 4d6d847 [li-zhihui] Move waitBackendReady to TaskSchedulerImpl.start & some code refactor 0ecee9a [li-zhihui] Move waitBackendReady from DAGScheduler.submitStage to TaskSchedulerImpl.submitTasks 4261454 [li-zhihui] Add docs for new configs & code style ce0868a [li-zhihui] Code style, rename configuration property name of minRegisteredRatio & maxRegisteredWaitingTime 6cfb9ec [li-zhihui] Code style, revert default minRegisteredRatio of yarn to 0, driver get --num-executors in yarn/alpha 812c33c [li-zhihui] Fix driver lost --num-executors option in yarn-cluster mode e7b6272 [li-zhihui] support yarn-cluster 37f7dc2 [li-zhihui] support yarn mode(percentage style) 3f8c941 [li-zhihui] submit stage after (configured ratio of) executors have been registered
Diffstat (limited to 'docs')
-rw-r--r--docs/configuration.md19
1 files changed, 19 insertions, 0 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index 0aea23ab59..07aa4c0354 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -699,6 +699,25 @@ Apart from these, the following properties are also available, and may be useful
(in milliseconds)
</td>
</tr>
+</tr>
+ <td><code>spark.scheduler.minRegisteredExecutorsRatio</code></td>
+ <td>0</td>
+ <td>
+ The minimum ratio of registered executors (registered executors / total expected executors)
+ to wait for before scheduling begins. Specified as a double between 0 and 1.
+ Regardless of whether the minimum ratio of executors has been reached,
+ the maximum amount of time it will wait before scheduling begins is controlled by config
+ <code>spark.scheduler.maxRegisteredExecutorsWaitingTime</code>
+ </td>
+</tr>
+<tr>
+ <td><code>spark.scheduler.maxRegisteredExecutorsWaitingTime</code></td>
+ <td>30000</td>
+ <td>
+ Maximum amount of time to wait for executors to register before scheduling begins
+ (in milliseconds).
+ </td>
+</tr>
</table>
#### Security