From 093291cf9b8729c0bd057cf67aed840b11f8c94a Mon Sep 17 00:00:00 2001
From: Andrew
Date: Wed, 27 Jan 2016 09:31:44 +0000
Subject: [SPARK-1680][DOCS] Explain environment variables for running on YARN
 in cluster mode

JIRA 1680 added a property called spark.yarn.appMasterEnv. This PR draws
users' attention to this special case by adding an explanation in
configuration.html#environment-variables

Author: Andrew

Closes #10869 from weineran/branch-yarn-docs.
---
 docs/configuration.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/configuration.md b/docs/configuration.md
index d2a2f10524..74a8fb5d35 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1643,6 +1643,8 @@ to use on each machine and maximum memory.
 Since `spark-env.sh` is a shell script, some of these can be set programmatically -- for example, you might
 compute `SPARK_LOCAL_IP` by looking up the IP of a specific network interface.
 
+Note: When running Spark on YARN in `cluster` mode, environment variables need to be set using the `spark.yarn.appMasterEnv.[EnvironmentVariableName]` property in your `conf/spark-defaults.conf` file. Environment variables that are set in `spark-env.sh` will not be reflected in the YARN Application Master process in `cluster` mode. See the [YARN-related Spark Properties](running-on-yarn.html#spark-properties) for more information.
+
 # Configuring Logging
 
 Spark uses [log4j](http://logging.apache.org/log4j/) for logging. You can configure it by adding a
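
As a hedged illustration of the behavior the added note documents: the property name `spark.yarn.appMasterEnv.[EnvironmentVariableName]` comes from the patch itself, but the environment variable (JAVA_HOME), its value, and the application class/jar below are hypothetical placeholders, not part of the patch.

    # conf/spark-defaults.conf -- hypothetical entry: export JAVA_HOME into the
    # YARN Application Master process when running in cluster mode
    spark.yarn.appMasterEnv.JAVA_HOME   /usr/lib/jvm/java-1.8.0

    # Equivalent one-off setting at submit time (class and jar names are placeholders)
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/lib/jvm/java-1.8.0 \
      --class org.example.MyApp \
      my-app.jar

Setting the same variable in `spark-env.sh` would affect the local submission environment but, as the note explains, would not reach the Application Master in `cluster` mode.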