| author | huangzhaowei <carlmartinmax@gmail.com> | 2015-07-01 23:01:44 -0700 |
|---|---|---|
| committer | Andrew Or <andrew@databricks.com> | 2015-07-01 23:01:44 -0700 |
| commit | 646366b5d2f12e42f8e7287672ba29a8c918a17d (patch) | |
| tree | dd7246f9802407aa165b72a672a08f7219d16ae4 /core | |
| parent | 792fcd802c99a0aef2b67d54f0e6e58710e65956 (diff) | |
| download | spark-646366b5d2f12e42f8e7287672ba29a8c918a17d.tar.gz spark-646366b5d2f12e42f8e7287672ba29a8c918a17d.tar.bz2 spark-646366b5d2f12e42f8e7287672ba29a8c918a17d.zip | |
[SPARK-8688] [YARN] Bug fix: disable the cache fs to gain the HDFS connection.
If `fs.hdfs.impl.disable.cache` is `false` (the default), `FileSystem` reuses the cached `DFSClient`, which still carries the old token.
[AMDelegationTokenRenewer](https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala#L196)
```scala
val credentials = UserGroupInformation.getCurrentUser.getCredentials
credentials.writeTokenStorageFile(tempTokenPath, discachedConfiguration)
```
Although `credentials` contains the new token, the call above still goes through the cached client, which authenticates with the old token.
It is therefore better to set `fs.hdfs.impl.disable.cache` to `true`, so that a fresh client is created and expired tokens are never reused.
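A minimal sketch of the idea: the configuration key is the real Hadoop setting, but the surrounding code is illustrative only, not part of this patch.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

// With the cache disabled, FileSystem.get constructs a fresh client instead
// of returning the process-wide cached instance, so it picks up the
// delegation tokens currently attached to the user's credentials.
val conf = new Configuration()
conf.setBoolean("fs.hdfs.impl.disable.cache", true)
val fs = FileSystem.get(conf)
```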
[Jira](https://issues.apache.org/jira/browse/SPARK-8688)
Author: huangzhaowei <carlmartinmax@gmail.com>
Closes #7069 from SaintBacchus/SPARK-8688 and squashes the following commits:
f94cd0b [huangzhaowei] modify function parameter
8fb9eb9 [huangzhaowei] explicit the comment
0cd55c9 [huangzhaowei] Rename function name to be an accurate one
cf776a1 [huangzhaowei] [SPARK-8688][YARN]Bug fix: disable the cache fs to gain the HDFS connection.
Diffstat (limited to 'core')
| -rw-r--r-- | core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala | 13 |
1 file changed, 13 insertions(+), 0 deletions(-)
```diff
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala b/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
index 7fa75ac8c2..6d14590a1d 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
@@ -334,6 +334,19 @@ class SparkHadoopUtil extends Logging {
    * Stop the thread that does the delegation token updates.
    */
   private[spark] def stopExecutorDelegationTokenRenewer() {}
+
+  /**
+   * Return a fresh Hadoop configuration, bypassing the HDFS cache mechanism.
+   * This is to prevent the DFSClient from using an old cached token to connect to the NameNode.
+   */
+  private[spark] def getConfBypassingFSCache(
+      hadoopConf: Configuration,
+      scheme: String): Configuration = {
+    val newConf = new Configuration(hadoopConf)
+    val confKey = s"fs.${scheme}.impl.disable.cache"
+    newConf.setBoolean(confKey, true)
+    newConf
+  }
 }
 
 object SparkHadoopUtil {
```
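A hedged usage sketch of the new helper: the caller and its name are assumptions for illustration, and only `getConfBypassingFSCache` comes from the patch. Since the helper is `private[spark]`, a real caller would live inside the `org.apache.spark` package (the YARN delegation-token renewer is the intended consumer).

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.deploy.SparkHadoopUtil

// Hypothetical caller: derive the scheme from the target path, build a
// cache-bypassing configuration, and obtain a FileSystem whose DFSClient
// is created with the current delegation tokens rather than cached ones.
def fileSystemWithFreshTokens(hadoopConf: Configuration, path: Path): FileSystem = {
  val scheme = path.toUri.getScheme // e.g. "hdfs"
  val freshConf = SparkHadoopUtil.get.getConfBypassingFSCache(hadoopConf, scheme)
  path.getFileSystem(freshConf)
}
```

Copying the incoming configuration rather than mutating it keeps the cache enabled for all other callers; only the code path that must see renewed tokens pays the cost of constructing a new client.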