From 35ed09f1d130c16d79b4840f7f87827359c7cb10 Mon Sep 17 00:00:00 2001
From: Jey Kottalam
Date: Wed, 4 Sep 2013 11:52:16 -0700
Subject: Clarify YARN example

---
 docs/running-on-yarn.md | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

(limited to 'docs')

diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index 93421efcbc..c611db0af4 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -42,7 +42,7 @@ This would be used to connect to the cluster, write to the dfs and submit jobs t
 
 The command to launch the YARN Client is as follows:
 
-    SPARK_JAR= ./spark-class org.apache.spark.deploy.yarn.Client \
+    SPARK_JAR= ./spark-class org.apache.spark.deploy.yarn.Client \
       --jar \
       --class \
       --args \
@@ -54,14 +54,27 @@ The command to launch the YARN Client is as follows:
 
 For example:
 
-    SPARK_JAR=./yarn/target/spark-yarn-assembly-{{site.SPARK_VERSION}}.jar ./spark-class org.apache.spark.deploy.yarn.Client \
-      --jar examples/target/scala-{{site.SCALA_VERSION}}/spark-examples_{{site.SCALA_VERSION}}-{{site.SPARK_VERSION}}.jar \
-      --class org.apache.spark.examples.SparkPi \
-      --args yarn-standalone \
-      --num-workers 3 \
-      --master-memory 4g \
-      --worker-memory 2g \
-      --worker-cores 1
+    # Build the Spark assembly JAR and the Spark examples JAR
+    $ SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_YARN=true ./sbt/sbt assembly
+
+    # Configure logging
+    $ cp conf/log4j.properties.template conf/log4j.properties
+
+    # Submit Spark's ApplicationMaster to YARN's ResourceManager, and instruct Spark to run the SparkPi example
+    $ SPARK_JAR=./assembly/target/scala-{{site.SCALA_VERSION}}/spark-assembly-{{site.SPARK_VERSION}}-hadoop2.0.5-alpha.jar \
+        ./spark-class org.apache.spark.deploy.yarn.Client \
+          --jar examples/target/scala-{{site.SCALA_VERSION}}/spark-examples-assembly-{{site.SPARK_VERSION}}.jar \
+          --class org.apache.spark.examples.SparkPi \
+          --args yarn-standalone \
+          --num-workers 3 \
+          --master-memory 4g \
+          --worker-memory 2g \
+          --worker-cores 1
+
+    # Examine the output (replace $YARN_APP_ID in the following with the "application identifier" output by the previous command)
+    # (Note: YARN_APP_LOGS_DIR is usually /tmp/logs or $HADOOP_HOME/logs/userlogs depending on the Hadoop version.)
+    $ cat $YARN_APP_LOGS_DIR/$YARN_APP_ID/container*_000001/stdout
+    Pi is roughly 3.13794
 
 The above starts a YARN Client programs which periodically polls the Application Master for status updates and displays them in the console. The client will exit once your application has finished running.
 
-- 
cgit v1.2.3

From e653a9d8914059fc8430f1d0d4ee9296d8ed9651 Mon Sep 17 00:00:00 2001
From: Patrick Wendell
Date: Fri, 6 Sep 2013 12:15:49 -0700
Subject: Provide docs to describe running on CDH/HDP cluster.

This doc consolidates information relevant to CDH/HDP users in a single place.
---
 docs/_layouts/global.html |  1 +
 docs/cdh-hdp.md           | 66 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)
 create mode 100644 docs/cdh-hdp.md

(limited to 'docs')

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index 84749fda4e..3a3b8dce37 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -98,6 +98,7 @@