author     Matei Zaharia <matei@databricks.com>   2014-01-07 14:35:52 -0500
committer  Matei Zaharia <matei@databricks.com>   2014-01-07 14:35:52 -0500
commit     d8bcc8e9a095c1b20dd7a17b6535800d39bff80e (patch)
tree       f3f5a1368a43b765b541be706921903cc6ac8da0 /docs/configuration.md
parent     15d953450167c4ec45c9d0a2c7ab8ee71be2e576 (diff)
Add way to limit default # of cores used by applications on standalone mode
Also documents the spark.deploy.spreadOut option.
Diffstat (limited to 'docs/configuration.md')
 docs/configuration.md | 33
 1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/docs/configuration.md b/docs/configuration.md
index 1d36ecb9c1..52ed59be30 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -77,13 +77,14 @@ there are at least five properties that you will commonly want to control:
 </tr>
 <tr>
   <td>spark.cores.max</td>
-  <td>(infinite)</td>
+  <td>(not set)</td>
   <td>
     When running on a <a href="spark-standalone.html">standalone deploy cluster</a> or a
     <a href="running-on-mesos.html#mesos-run-modes">Mesos cluster in "coarse-grained"
     sharing mode</a>, the maximum amount of CPU cores to request for the application from
-    across the cluster (not from each machine). The default will use all available cores
-    offered by the cluster manager.
+    across the cluster (not from each machine). If not set, the default will be
+    <code>spark.deploy.defaultCores</code> on Spark's standalone cluster manager, or
+    infinite (all available cores) on Mesos.
   </td>
 </tr>
 </table>
@@ -404,12 +405,36 @@ Apart from these, the following properties are also available, and may be useful
   </td>
 </tr>
 <tr>
-  <td>spark.log-conf</td>
+  <td>spark.logConf</td>
   <td>false</td>
   <td>
     Log the supplied SparkConf as INFO at start of spark context.
   </td>
 </tr>
+<tr>
+  <td>spark.deploy.spreadOut</td>
+  <td>true</td>
+  <td>
+    Whether the standalone cluster manager should spread applications out across nodes or try
+    to consolidate them onto as few nodes as possible. Spreading out is usually better for
+    data locality in HDFS, but consolidating is more efficient for compute-intensive workloads. <br/>
+    <b>Note:</b> this setting needs to be configured in the cluster master, not in individual
+    applications; you can set it through <code>SPARK_JAVA_OPTS</code> in <code>spark-env.sh</code>.
+  </td>
+</tr>
+<tr>
+  <td>spark.deploy.defaultCores</td>
+  <td>(infinite)</td>
+  <td>
+    Default number of cores to give to applications in Spark's standalone mode if they don't
+    set <code>spark.cores.max</code>. If not set, applications always get all available
+    cores unless they configure <code>spark.cores.max</code> themselves.
+    Set this lower on a shared cluster to prevent users from grabbing
+    the whole cluster by default. <br/>
+    <b>Note:</b> this setting needs to be configured in the cluster master, not in individual
+    applications; you can set it through <code>SPARK_JAVA_OPTS</code> in <code>spark-env.sh</code>.
+  </td>
+</tr>
 </table>

 ## Viewing Spark Properties
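The Note in each new table row says the `spark.deploy.*` settings belong on the cluster master, not in individual applications. A minimal sketch of what that could look like, assuming the standard `conf/spark-env.sh` location and an arbitrary example cap of 8 cores (the cap value and the per-application `spark.cores.max` override of 4 are illustrative, not from the commit):

```shell
# Sketch: master-side settings in conf/spark-env.sh on the standalone master.
# Cap the default per-application allocation at 8 cores (arbitrary example),
# and keep the default spread-out scheduling explicit.
SPARK_JAVA_OPTS="$SPARK_JAVA_OPTS -Dspark.deploy.defaultCores=8"
SPARK_JAVA_OPTS="$SPARK_JAVA_OPTS -Dspark.deploy.spreadOut=true"
export SPARK_JAVA_OPTS

# Application side: an individual app can still claim more (or fewer) cores
# than the cluster default by setting spark.cores.max as a system property
# for its own driver JVM.
APP_OPTS="-Dspark.cores.max=4"
```

With this in place, an application that sets nothing gets at most 8 cores, while one that passes `spark.cores.max` gets whatever it asks for, which matches the "(not set)" default described in the first hunk above.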