author | Sandy Ryza <sandy@cloudera.com> | 2014-03-29 14:41:36 -0700 |
---|---|---|
committer | Patrick Wendell <pwendell@gmail.com> | 2014-03-29 14:41:36 -0700 |
commit | 1617816090e7b20124a512a43860a21232ebf511 (patch) | |
tree | cb6e45d21cb59edd81ab3bc29b9e00ab034bb90d /docs/running-on-yarn.md | |
parent | 3738f24421d6f3bd10e5ef9ebfc10f702a5cb7ac (diff) | |
SPARK-1126. spark-app preliminary
This is a starting version of the spark-app script for running compiled binaries against Spark. It still needs tests and some polish. The only testing I've done so far has been using it to launch jobs in yarn-standalone mode against a pseudo-distributed cluster.
This leaves out the changes required for launching python scripts. I think it might be best to save those for another JIRA/PR (while keeping to the design so that they won't require backwards-incompatible changes).
Author: Sandy Ryza <sandy@cloudera.com>
Closes #86 from sryza/sandy-spark-1126 and squashes the following commits:
d428d85 [Sandy Ryza] Commenting, doc, and import fixes from Patrick's comments
e7315c6 [Sandy Ryza] Fix failing tests
34de899 [Sandy Ryza] Change --more-jars to --jars and fix docs
299ddca [Sandy Ryza] Fix scalastyle
a94c627 [Sandy Ryza] Add newline at end of SparkSubmit
04bc4e2 [Sandy Ryza] SPARK-1126. spark-submit script
Diffstat (limited to 'docs/running-on-yarn.md')
-rw-r--r-- | docs/running-on-yarn.md | 6 |
1 files changed, 4 insertions, 2 deletions
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index 2e9dec4856..d8657c4bc7 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -48,10 +48,12 @@ System Properties:
 Ensure that HADOOP_CONF_DIR or YARN_CONF_DIR points to the directory which contains the (client side) configuration files for the Hadoop cluster. These configs are used to connect to the cluster, write to the dfs, and connect to the YARN ResourceManager.
 
-There are two scheduler modes that can be used to launch Spark applications on YARN. In yarn-cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In yarn-client mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
+There are two deploy modes that can be used to launch Spark applications on YARN. In yarn-cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In yarn-client mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
 
 Unlike in Spark standalone and Mesos mode, in which the master's address is specified in the "master" parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the master parameter is simply "yarn-client" or "yarn-cluster".
 
+The spark-submit script described in the [cluster mode overview](cluster-overview.html) provides the most straightforward way to submit a compiled Spark application to YARN in either deploy mode. For info on the lower-level invocations it uses, read ahead. For running spark-shell against YARN, skip down to the yarn-client section.
+
 ## Launching a Spark application with yarn-cluster mode.
 
 The command to launch the Spark application on the cluster is as follows:
@@ -121,7 +123,7 @@ or
 
 MASTER=yarn-client ./bin/spark-shell
 
-## Viewing logs
+# Viewing logs
 
 In YARN terminology, executors and application masters run inside "containers". YARN has two modes for handling container logs after an application has completed. If log aggregation is turned on (with the yarn.log-aggregation-enable config), container logs are copied to HDFS and deleted on the local machine. These logs can be viewed from anywhere on the cluster with the "yarn logs" command.
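
Reviewer note: the distinction the changed paragraph draws between the two deploy modes can be summarized in a small sketch. This is only an illustrative model of the doc text, not Spark code; `DEPLOY_MODES`, `describe_master`, and `spark_shell_env` are hypothetical names, and the `client_can_exit` value for yarn-client is an inference from the driver running in the client process.

```python
# Illustrative model of the two YARN deploy modes described in the doc text.
# None of these names exist in Spark; they are assumptions for this sketch.

DEPLOY_MODES = {
    "yarn-cluster": {
        # Driver runs inside the YARN-managed application master.
        "driver_runs_in": "application master",
        # The doc says the client can go away after initiating the application.
        "client_can_exit": True,
    },
    "yarn-client": {
        # Driver runs in the client process; the application master
        # is only used for requesting resources from YARN.
        "driver_runs_in": "client process",
        # Inferred: the client hosts the driver, so it has to stay up.
        "client_can_exit": False,
    },
}


def describe_master(master):
    """Return the deploy-mode summary for a YARN master parameter."""
    if master not in DEPLOY_MODES:
        raise ValueError(
            "in YARN mode the master parameter is simply "
            "'yarn-client' or 'yarn-cluster', got %r" % (master,)
        )
    return DEPLOY_MODES[master]


def spark_shell_env(base_env):
    """Copy an environment and set MASTER as in `MASTER=yarn-client ./bin/spark-shell`."""
    env = dict(base_env)
    env["MASTER"] = "yarn-client"
    return env
```

Note that the ResourceManager address appears nowhere in the sketch: per the doc, it is picked up from the Hadoop configuration under HADOOP_CONF_DIR or YARN_CONF_DIR rather than from the master parameter.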