aboutsummaryrefslogtreecommitdiff
path: root/docs/submitting-applications.md
diff options
context:
space:
mode:
authorSean Owen <sowen@cloudera.com>2015-10-04 09:31:52 +0100
committerSean Owen <sowen@cloudera.com>2015-10-04 09:31:52 +0100
commit82bbc2a5f2c74604db59060cb5e462a057398ddd (patch)
tree81d81e84cefa3eed3019fd282aff4317803503ee /docs/submitting-applications.md
parent721e8b5f35b230ff426c1757a9bdc1399fb19afa (diff)
downloadspark-82bbc2a5f2c74604db59060cb5e462a057398ddd.tar.gz
spark-82bbc2a5f2c74604db59060cb5e462a057398ddd.tar.bz2
spark-82bbc2a5f2c74604db59060cb5e462a057398ddd.zip
[SPARK-9570] [DOCS] Consistent recommendation for submitting spark apps to YARN, -master yarn --deploy-mode x vs -master yarn-x'.
Recommend `--master yarn --deploy-mode {cluster,client}` consistently in docs. Follow-on to https://github.com/apache/spark/pull/8385 CC nssalian Author: Sean Owen <sowen@cloudera.com> Closes #8968 from srowen/SPARK-9570.
Diffstat (limited to 'docs/submitting-applications.md')
-rw-r--r--docs/submitting-applications.md25
1 files changed, 15 insertions, 10 deletions
diff --git a/docs/submitting-applications.md b/docs/submitting-applications.md
index 915be0f479..ac2a14eb56 100644
--- a/docs/submitting-applications.md
+++ b/docs/submitting-applications.md
@@ -103,7 +103,8 @@ run it with `--help`. Here are a few examples of common options:
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
- --master yarn-cluster \ # can also be yarn-client for client mode
+ --master yarn \
+ --deploy-mode cluster \ # can be client for client mode
--executor-memory 20G \
--num-executors 50 \
/path/to/examples.jar \
@@ -122,21 +123,25 @@ The master URL passed to Spark can be in one of the following formats:
<table class="table">
<tr><th>Master URL</th><th>Meaning</th></tr>
-<tr><td> local </td><td> Run Spark locally with one worker thread (i.e. no parallelism at all). </td></tr>
-<tr><td> local[K] </td><td> Run Spark locally with K worker threads (ideally, set this to the number of cores on your machine). </td></tr>
-<tr><td> local[*] </td><td> Run Spark locally with as many worker threads as logical cores on your machine.</td></tr>
-<tr><td> spark://HOST:PORT </td><td> Connect to the given <a href="spark-standalone.html">Spark standalone
+<tr><td> <code>local</code> </td><td> Run Spark locally with one worker thread (i.e. no parallelism at all). </td></tr>
+<tr><td> <code>local[K]</code> </td><td> Run Spark locally with K worker threads (ideally, set this to the number of cores on your machine). </td></tr>
+<tr><td> <code>local[*]</code> </td><td> Run Spark locally with as many worker threads as logical cores on your machine.</td></tr>
+<tr><td> <code>spark://HOST:PORT</code> </td><td> Connect to the given <a href="spark-standalone.html">Spark standalone
cluster</a> master. The port must be whichever one your master is configured to use, which is 7077 by default.
</td></tr>
-<tr><td> mesos://HOST:PORT </td><td> Connect to the given <a href="running-on-mesos.html">Mesos</a> cluster.
+<tr><td> <code>mesos://HOST:PORT</code> </td><td> Connect to the given <a href="running-on-mesos.html">Mesos</a> cluster.
The port must be whichever one your is configured to use, which is 5050 by default.
Or, for a Mesos cluster using ZooKeeper, use <code>mesos://zk://...</code>.
</td></tr>
-<tr><td> yarn-client </td><td> Connect to a <a href="running-on-yarn.html"> YARN </a> cluster in
-client mode. The cluster location will be found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable.
+<tr><td> <code>yarn</code> </td><td> Connect to a <a href="running-on-yarn.html"> YARN </a> cluster in
+ <code>client</code> or <code>cluster</code> mode depending on the value of <code>--deploy-mode</code>.
+ The cluster location will be found based on the <code>HADOOP_CONF_DIR</code> or <code>YARN_CONF_DIR</code> variable.
</td></tr>
-<tr><td> yarn-cluster </td><td> Connect to a <a href="running-on-yarn.html"> YARN </a> cluster in
-cluster mode. The cluster location will be found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable.
+<tr><td> <code>yarn-client</code> </td><td> Equivalent to <code>yarn</code> with <code>--deploy-mode client</code>,
+ which is preferred to `yarn-client`
+</td></tr>
+<tr><td> <code>yarn-cluster</code> </td><td> Equivalent to <code>yarn</code> with <code>--deploy-mode cluster</code>,
+ which is preferred to `yarn-cluster`
</td></tr>
</table>