author: jerryshao <sshao@hortonworks.com> 2016-08-10 15:39:30 -0700
committer: Marcelo Vanzin <vanzin@cloudera.com> 2016-08-10 15:39:30 -0700
commit: ab648c0004cfb20d53554ab333dd2d198cb94ffa
tree: 74fa18e0a21caedaca6eda3557d60c9bd3af07b0 /project
parent: bd2c12fb4994785d5becce541aee9ba73fef1c4c
[SPARK-14743][YARN] Add a configurable credential manager for Spark running on YARN
## What changes were proposed in this pull request?

Add a configurable token manager for Spark running on YARN.

### Current Problems ###

1. The supported token providers are hard-coded: currently only hdfs, hbase and hive are supported, and it is impossible for a user to add a new token provider without code changes.
2. The same problem exists in the periodic token renewer and updater.

### Changes In This Proposal ###

To address the problems mentioned above and make the current code cleaner and easier to understand, this proposal makes 3 main changes:

1. Abstract a `ServiceTokenProvider` as well as a `ServiceTokenRenewable` interface for token providers. Each service that wants to communicate with Spark using tokens needs to implement this interface.
2. Provide a `ConfigurableTokenManager` to manage all the registered token providers, as well as the token renewer and updater. This class also offers the API for other modules to obtain tokens, get the renewal interval and so on.
3. Implement 3 built-in token providers, `HDFSTokenProvider`, `HiveTokenProvider` and `HBaseTokenProvider`, to keep the same semantics as supported today. Whether to load these built-in token providers is controlled by the configuration "spark.yarn.security.tokens.${service}.enabled"; by default all the built-in token providers are loaded.

### Behavior Changes ###

For the end user there is no behavior change: we still use the same configuration `spark.yarn.security.tokens.${service}.enabled` to decide which token provider is enabled (hbase or hive).

A user-implemented token provider (assume the name of the token provider is "test") that needs to be added into this class should have two configurations:

1. `spark.yarn.security.tokens.test.enabled` set to true
2. `spark.yarn.security.tokens.test.class` set to the fully qualified class name.

So we still keep the same semantics as the current code while adding one new configuration.

### Current Status ###

- [x] token provider interface and management framework.
- [x] implement built-in token providers (hdfs, hbase, hive).
- [x] Coverage of unit test.
- [x] Integrated test with security cluster.

## How was this patch tested?

Unit test and integrated test.

Please suggest and review, any comment is greatly appreciated.

Author: jerryshao <sshao@hortonworks.com>

Closes #14065 from jerryshao/SPARK-16342.
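
For illustration, the two settings that the Behavior Changes section describes for a user-implemented provider named "test" might look like this in `spark-defaults.conf`. This is only a sketch based on the key names given in this commit message: the class name `com.example.security.TestTokenProvider` is a hypothetical placeholder, and the final key names in a given Spark release may differ from the ones proposed here.

```properties
# Enable the user-implemented token provider registered under the name "test".
spark.yarn.security.tokens.test.enabled  true

# Fully qualified class name of the provider implementation.
# Hypothetical placeholder: substitute your own class.
spark.yarn.security.tokens.test.class    com.example.security.TestTokenProvider
```

The same two keys could also be passed per application with `--conf key=value` on `spark-submit`.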
Diffstat (limited to 'project')
-rw-r--r-- project/MimaExcludes.scala 5
1 files changed, 4 insertions, 1 deletions
diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index a201d7f838..688218f6f4 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -784,7 +784,10 @@ object MimaExcludes {
ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.SQLContext.jdbc"),
ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.SQLContext.parquetFile"),
ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.SQLContext.applySchema")
- )
+ ) ++ Seq(
+ // [SPARK-14743] Improve delegation token handling in secure cluster
+ ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.deploy.SparkHadoopUtil.getTimeFromNowToRenewal")
+ )
}
def excludes(version: String) = version match {