| author | huangzhaowei <carlmartinmax@gmail.com> | 2015-07-01 23:01:44 -0700 |
|---|---|---|
| committer | Andrew Or <andrew@databricks.com> | 2015-07-01 23:01:44 -0700 |
| commit | 646366b5d2f12e42f8e7287672ba29a8c918a17d (patch) | |
| tree | dd7246f9802407aa165b72a672a08f7219d16ae4 /core | |
| parent | 792fcd802c99a0aef2b67d54f0e6e58710e65956 (diff) | |
| download | spark-646366b5d2f12e42f8e7287672ba29a8c918a17d.tar.gz spark-646366b5d2f12e42f8e7287672ba29a8c918a17d.tar.bz2 spark-646366b5d2f12e42f8e7287672ba29a8c918a17d.zip | |
[SPARK-8688] [YARN] Bug fix: disable the cache fs to gain the HDFS connection.
If `fs.hdfs.impl.disable.cache` is `false` (the default), `FileSystem` reuses the cached `DFSClient`, which still carries the old token.
[AMDelegationTokenRenewer](https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala#L196)
```scala
val credentials = UserGroupInformation.getCurrentUser.getCredentials
credentials.writeTokenStorageFile(tempTokenPath, discachedConfiguration)
```
Although `credentials` contains the new token, the call above still goes through the cached client, which authenticates with the old token.
It is therefore better to set `fs.hdfs.impl.disable.cache` to `true`, so that a fresh client is created and expired tokens are never reused.
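A minimal sketch of the idea: the configuration key is the real Hadoop setting, but the surrounding code is illustrative only, not part of this patch.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

// With the cache disabled, FileSystem.get constructs a fresh client instead
// of returning the process-wide cached instance, so it picks up the
// delegation tokens currently attached to the user's credentials.
val conf = new Configuration()
conf.setBoolean("fs.hdfs.impl.disable.cache", true)
val fs = FileSystem.get(conf)
```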
[Jira](https://issues.apache.org/jira/browse/SPARK-8688)
Author: huangzhaowei <carlmartinmax@gmail.com>
Closes #7069 from SaintBacchus/SPARK-8688 and squashes the following commits:
f94cd0b [huangzhaowei] modify function parameter
8fb9eb9 [huangzhaowei] explicit the comment
0cd55c9 [huangzhaowei] Rename function name to be an accurate one
cf776a1 [huangzhaowei] [SPARK-8688][YARN]Bug fix: disable the cache fs to gain the HDFS connection.
Diffstat (limited to 'core')
| -rw-r--r-- | core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala | 13 |
1 file changed, 13 insertions(+), 0 deletions(-)
```diff
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala b/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
index 7fa75ac8c2..6d14590a1d 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
@@ -334,6 +334,19 @@ class SparkHadoopUtil extends Logging {
    * Stop the thread that does the delegation token updates.
    */
   private[spark] def stopExecutorDelegationTokenRenewer() {}
+
+  /**
+   * Return a fresh Hadoop configuration, bypassing the HDFS cache mechanism.
+   * This is to prevent the DFSClient from using an old cached token to connect to the NameNode.
+   */
+  private[spark] def getConfBypassingFSCache(
+      hadoopConf: Configuration,
+      scheme: String): Configuration = {
+    val newConf = new Configuration(hadoopConf)
+    val confKey = s"fs.${scheme}.impl.disable.cache"
+    newConf.setBoolean(confKey, true)
+    newConf
+  }
 }
 
 object SparkHadoopUtil {
```
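A hedged usage sketch of the new helper: the caller and its name are assumptions for illustration, and only `getConfBypassingFSCache` comes from the patch. Since the helper is `private[spark]`, a real caller would live inside the `org.apache.spark` package (the YARN delegation-token renewer is the intended consumer).

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.deploy.SparkHadoopUtil

// Hypothetical caller: derive the scheme from the target path, build a
// cache-bypassing configuration, and obtain a FileSystem whose DFSClient
// is created with the current delegation tokens rather than cached ones.
def fileSystemWithFreshTokens(hadoopConf: Configuration, path: Path): FileSystem = {
  val scheme = path.toUri.getScheme // e.g. "hdfs"
  val freshConf = SparkHadoopUtil.get.getConfBypassingFSCache(hadoopConf, scheme)
  path.getFileSystem(freshConf)
}
```

Copying the incoming configuration rather than mutating it keeps the cache enabled for all other callers; only the code path that must see renewed tokens pays the cost of constructing a new client.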