| author | Matei Zaharia <matei@eecs.berkeley.edu> | 2012-09-25 23:26:56 -0700 |
|---|---|---|
| committer | Matei Zaharia <matei@eecs.berkeley.edu> | 2012-09-25 23:26:56 -0700 |
| commit | f1246cc7c18bd0c155f920f4dc593e88147a94e4 (patch) | |
| tree | 7dd8f81eab5261f6f2e707b3b668b9a9cc7bdc50 /docs/running-on-yarn.md | |
| parent | 051785c7e67b7ba0f2f0b5e078753d3f4f380961 (diff) | |
| download | spark-f1246cc7c18bd0c155f920f4dc593e88147a94e4.tar.gz spark-f1246cc7c18bd0c155f920f4dc593e88147a94e4.tar.bz2 spark-f1246cc7c18bd0c155f920f4dc593e88147a94e4.zip | |
Various enhancements to the programming guide and HTML/CSS
Diffstat (limited to 'docs/running-on-yarn.md')
| -rw-r--r-- | docs/running-on-yarn.md | 8 |

1 file changed, 4 insertions(+), 4 deletions(-)
```diff
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index 3c0e54671b..7cd46da940 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -3,16 +3,16 @@ layout: global
 title: Launching Spark on YARN
 ---
 
-Spark allows you to launch jobs on an existing [YARN](http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/YARN.html) cluster.
+Spark allows you to launch jobs on an existing [YARN](http://hadoop.apache.org/docs/r2.0.1-alpha/hadoop-yarn/hadoop-yarn-site/YARN.html) cluster.
 
-## Preparations
+# Preparations
 
 - In order to distribute Spark within the cluster it must be packaged into a single JAR file. This can be done by running `sbt/sbt assembly`
 - Your application code must be packaged into a separate jar file. If you want to test out the YARN deployment mode, you can use the current spark examples. A `spark-examples_2.9.1-0.6.0-SNAPSHOT.jar` file can be generated by running `sbt/sbt package`.
 
-## Launching Spark on YARN
+# Launching Spark on YARN
 
 The command to launch the YARN Client is as follows:
@@ -36,7 +36,7 @@ For example:
 
 The above starts a YARN Client programs which periodically polls the Application Master for status updates and displays them in the console. The client will exit once your application has finished running.
 
-## Important Notes
+# Important Notes
 
 - When your application instantiates a Spark context it must use a special "standalone" master url. This starts the scheduler without forcing it to connect to a cluster. A good way to handle this is to pass "standalone" as an argument to your program, as shown in the example above.
 - YARN does not support requesting container resources based on the number of cores. Thus the numbers of cores given via command line arguments cannot be guaranteed.
```