[SPARK-6662][YARN] Allow variable substitution in spark.yarn.historyServer.address

In Spark on YARN, an explicit hostname and port number must be set for "spark.yarn.historyServer.address" in SparkConf to create the HISTORY link. If the history server address is known and static, this is usually not a problem. But in the cloud, that is usually not the case. In EMR in particular, the history server always runs on the same node as the RM, so I could simply set it to ${yarn.resourcemanager.hostname}:18080 if variable substitution were allowed.

In fact, Hadoop configuration already implements variable substitution, so if this property is read via YarnConf, this is easily achievable.

Author: Cheolsoo Park <cheolsoop@netflix.com>

Closes #5321 from piaozhexiu/SPARK-6662 and squashes the following commits:

e37de75 [Cheolsoo Park] Preserve the space between the Hadoop and Spark imports
79757c6 [Cheolsoo Park] Incorporate review comments
10e2917 [Cheolsoo Park] Add helper function that substitutes hadoop vars to SparkHadoopUtil
589b52c [Cheolsoo Park] Revert "Allow variable substitution for spark.yarn. properties"
ff9c35d [Cheolsoo Park] Allow variable substitution for spark.yarn. properties
author: Cheolsoo Park <cheolsoop@netflix.com> 2015-04-13 13:45:10 -0500
committer: Thomas Graves <tgraves@apache.org> 2015-04-13 13:45:10 -0500
commit: 6cc5b3ed3c0c729f97956fa017d8eb7d6b43f90f (patch)
tree: 0c1c1fb8b27573a8edfe3c2afe4a111948bd88dc /docs/running-on-yarn.md
parent: c5b0b296b842926b5c07531a5affe8984bc799c5 (diff)
1 files changed, 2 insertions, 1 deletions
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index b7e68d4f71..ed5bb263a5 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -87,7 +87,8 @@ Most of the configs are the same for Spark on YARN as for other deployment modes
   <td><code>spark.yarn.historyServer.address</code></td>
   <td>(none)</td>
   <td>
-    The address of the Spark history server (i.e. host.com:18080). The address should not contain a scheme (http://). Defaults to not being set since the history server is an optional service. This address is given to the YARN ResourceManager when the Spark application finishes to link the application from the ResourceManager UI to the Spark history server UI.
+    The address of the Spark history server (i.e. host.com:18080). The address should not contain a scheme (http://). Defaults to not being set since the history server is an optional service. This address is given to the YARN ResourceManager when the Spark application finishes to link the application from the ResourceManager UI to the Spark history server UI. 
+    For this property, YARN properties can be used as variables, and these are substituted by Spark at runtime. For eg, if the Spark history server runs on the same node as the YARN ResourceManager, it can be set to `${hadoopconf-yarn.resourcemanager.hostname}:18080`. 
   </td>
 </tr>
 <tr>
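
The substitution described in the documentation change above can be illustrated with a minimal sketch. This is not Spark's implementation (the commit message says the real helper lives in SparkHadoopUtil); the function name `substitute_hadoop_variables`, the plain dict standing in for the Hadoop configuration, and the hostname `rm.example.com` are all hypothetical. Only the `${hadoopconf-...}` syntax and the `yarn.resourcemanager.hostname` key come from the text above.

```python
import re

# Matches ${hadoopconf-<key>}; the "hadoopconf-" prefix is taken from the
# documentation text in the diff, everything else here is illustrative.
_HADOOP_VAR = re.compile(r"\$\{hadoopconf-([^}]+)\}")


def substitute_hadoop_variables(text, hadoop_conf):
    """Replace each ${hadoopconf-KEY} in text with hadoop_conf[KEY].

    hadoop_conf is a plain dict used as a stand-in for a Hadoop
    configuration object (an assumption made for this sketch).
    References to keys that are not present are left unchanged.
    """
    def _lookup(match):
        value = hadoop_conf.get(match.group(1))
        return match.group(0) if value is None else value

    return _HADOOP_VAR.sub(_lookup, text)


# Hypothetical cluster where the history server shares a node with the RM.
hadoop_conf = {"yarn.resourcemanager.hostname": "rm.example.com"}
address = substitute_hadoop_variables(
    "${hadoopconf-yarn.resourcemanager.hostname}:18080", hadoop_conf
)
print(address)  # rm.example.com:18080
```

Leaving unknown references untouched is a choice made for this sketch so that a misconfigured key stays visible in the resulting address; how Spark itself handles that case is not stated in this commit.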