authorjerryshao <sshao@hortonworks.com>2016-05-27 11:31:25 -0700
committerMarcelo Vanzin <vanzin@cloudera.com>2016-05-27 11:31:25 -0700
commit1b98fa2e4382d3d8385cf1ac25d7fd3ae5650475 (patch)
tree5616db9f72d7e6a46f8832159c51fba57c34369a
parent623aae5907f4ba8b7079c21328e0c0b5fef7bb00 (diff)
[YARN][DOC][MINOR] Remove several obsolete env variables and update the doc
## What changes were proposed in this pull request?

Remove several obsolete env variables that are no longer supported by Spark on YARN, and update the docs to reflect several changes introduced with 2.0.

## How was this patch tested?

N/A

CC vanzin tgravescs

Author: jerryshao <sshao@hortonworks.com>

Closes #13296 from jerryshao/yarn-doc.
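The removed env variables all have standard `spark-submit` equivalents, which is why they are obsolete. A hedged sketch of a submission using those flags (the application name, queue, paths, and class below are illustrative placeholders, not values from this patch):

```shell
# Flag equivalents of the removed env variables:
#   SPARK_YARN_APP_NAME      -> --name
#   SPARK_YARN_QUEUE         -> --queue
#   SPARK_YARN_DIST_FILES    -> --files
#   SPARK_YARN_DIST_ARCHIVES -> --archives
spark-submit \
  --master yarn \
  --name MyApp \
  --queue default \
  --files hdfs:///conf/app.conf \
  --archives hdfs:///libs/deps.zip \
  --class com.example.MyApp \
  app.jar
```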
-rwxr-xr-x  conf/spark-env.sh.template  4
-rw-r--r--  docs/running-on-yarn.md     4
2 files changed, 4 insertions, 4 deletions
diff --git a/conf/spark-env.sh.template b/conf/spark-env.sh.template
index a031cd6a72..9cffdc30c2 100755
--- a/conf/spark-env.sh.template
+++ b/conf/spark-env.sh.template
@@ -40,10 +40,6 @@
# - SPARK_EXECUTOR_CORES, Number of cores for the executors (Default: 1).
# - SPARK_EXECUTOR_MEMORY, Memory per Executor (e.g. 1000M, 2G) (Default: 1G)
# - SPARK_DRIVER_MEMORY, Memory for Driver (e.g. 1000M, 2G) (Default: 1G)
-# - SPARK_YARN_APP_NAME, The name of your application (Default: Spark)
-# - SPARK_YARN_QUEUE, The hadoop queue to use for allocation requests (Default: 'default')
-# - SPARK_YARN_DIST_FILES, Comma separated list of files to be distributed with the job.
-# - SPARK_YARN_DIST_ARCHIVES, Comma separated list of archives to be distributed with the job.
# Options for the daemons used in the standalone deploy mode
# - SPARK_MASTER_IP, to bind the master to a different IP address or hostname
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index f2fbe3ca56..9833806716 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -60,6 +60,8 @@ Running Spark on YARN requires a binary distribution of Spark which is built wit
Binary distributions can be downloaded from the [downloads page](http://spark.apache.org/downloads.html) of the project website.
To build Spark yourself, refer to [Building Spark](building-spark.html).
+To make Spark runtime jars accessible from the YARN side, you can specify `spark.yarn.archive` or `spark.yarn.jars`. For details please refer to [Spark Properties](running-on-yarn.html#spark-properties). If neither `spark.yarn.archive` nor `spark.yarn.jars` is specified, Spark will create a zip file with all jars under `$SPARK_HOME/jars` and upload it to the distributed cache.
+
# Configuration
Most of the configs are the same for Spark on YARN as for other deployment modes. See the [configuration page](configuration.html) for more information on those. These are configs that are specific to Spark on YARN.
@@ -99,6 +101,8 @@ to the same log file).
If you need a reference to the proper location to put log files in YARN so that YARN can properly display and aggregate them, use `spark.yarn.app.container.log.dir` in your `log4j.properties`. For example, `log4j.appender.file_appender.File=${spark.yarn.app.container.log.dir}/spark.log`. For streaming applications, configuring `RollingFileAppender` and setting the file location to YARN's log directory will avoid disk overflow caused by large log files, and logs can be accessed using YARN's log utility.
+To use a custom metrics.properties for the application master and executors, update the `$SPARK_CONF_DIR/metrics.properties` file. It will automatically be uploaded with other configurations, so you don't need to specify it manually with `--files`.
+
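The logging setup described above can be sketched as a `log4j.properties` fragment. This is a minimal sketch: the appender name `file_appender` follows the doc's own example, while the rotation size, backup count, and pattern are illustrative assumptions, not Spark defaults.

```properties
# Illustrative log4j 1.x settings; rotation values are assumptions.
log4j.rootLogger=INFO, file_appender
log4j.appender.file_appender=org.apache.log4j.RollingFileAppender
# Write into YARN's container log dir so YARN can display and aggregate logs.
log4j.appender.file_appender.File=${spark.yarn.app.container.log.dir}/spark.log
log4j.appender.file_appender.MaxFileSize=50MB
log4j.appender.file_appender.MaxBackupIndex=5
log4j.appender.file_appender.layout=org.apache.log4j.PatternLayout
log4j.appender.file_appender.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```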
#### Spark Properties
<table class="table">