path: root/docs/running-on-yarn.md
author    Thomas Graves <tgraves@apache.org>  2014-08-05 12:48:26 -0500
committer Thomas Graves <tgraves@apache.org>  2014-08-05 12:48:26 -0500
commit  2c0f705e26ca3dfc43a1e9a0722c0e57f67c970a (patch)
tree    d76f5c99e1e6c7eebfdf09009b87f07613c5ab2a /docs/running-on-yarn.md
parent  e87075df977a539e4a1684045a7bd66c36285174 (diff)
SPARK-1528 - spark on yarn, add support for accessing remote HDFS
Add a config (spark.yarn.access.namenodes) to allow applications running on YARN to access other secure HDFS clusters. The user simply specifies the namenodes of the other clusters, and Spark obtains tokens for those and ships them with the Spark application.

Author: Thomas Graves <tgraves@apache.org>

Closes #1159 from tgravescs/spark-1528 and squashes the following commits:

ddbcd16 [Thomas Graves] review comments
0ac8501 [Thomas Graves] SPARK-1528 - add support for accessing remote HDFS
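As a sketch of how the new config might be passed to an application (the hostnames are the placeholder examples from the docs; the application jar and class are illustrative):

```shell
# Sketch only: nn1.com/nn2.com are placeholder namenode hosts from the docs
# example, and SparkPi stands in for any user application.
# Spark fetches delegation tokens for each listed namenode at submit time
# and ships them with the application, so executors can read/write those
# secure HDFS clusters.
./bin/spark-submit \
  --master yarn-cluster \
  --conf spark.yarn.access.namenodes=hdfs://nn1.com:8032,hdfs://nn2.com:8032 \
  --class org.apache.spark.examples.SparkPi \
  lib/spark-examples.jar
```

The same property can instead be set in `spark-defaults.conf` if every application on the cluster needs access to the same remote namenodes.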
Diffstat (limited to 'docs/running-on-yarn.md')
-rw-r--r--  docs/running-on-yarn.md | 7
1 file changed, 7 insertions, 0 deletions
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index 0362f5a223..573930dbf4 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -106,6 +106,13 @@ Most of the configs are the same for Spark on YARN as for other deployment modes
set this configuration to "hdfs:///some/path".
</td>
</tr>
+<tr>
+ <td><code>spark.yarn.access.namenodes</code></td>
+ <td>(none)</td>
+ <td>
+    A list of secure HDFS namenodes your Spark application is going to access. For example, `spark.yarn.access.namenodes=hdfs://nn1.com:8032,hdfs://nn2.com:8032`. The Spark application must have access to the namenodes listed and Kerberos must be properly configured to be able to access them (either in the same realm or in a trusted realm). Spark acquires security tokens for each of the namenodes so that the Spark application can access those remote HDFS clusters.
+ </td>
+</tr>
</table>
# Launching Spark on YARN