[SPARK-6907] [SQL] Isolated client for HiveMetastore

This PR adds initial support for loading multiple versions of Hive in a single JVM and provides a common interface for extracting metadata from the `HiveMetastoreClient` for a given version. This is accomplished by creating an isolated `ClassLoader` that operates according to the following rules:

 - __Shared Classes__: Java, Scala, logging, and Spark classes are delegated to `baseClassLoader`, allowing the results of calls to the `ClientInterface` to be visible externally.
 - __Hive Classes__: new instances are loaded from `execJars`. These classes are not accessible externally due to their custom loading.
 - __Barrier Classes__: Classes such as `ClientWrapper` are defined in Spark but must link to a specific version of Hive. As a result, the bytecode is acquired from the Spark `ClassLoader`, but a new copy is created for each instance of `IsolatedClientLoader`. This new instance is able to see a specific version of Hive without using reflection wherever Hive is consistent across versions. Since this is a unique instance, it is not visible externally other than as a generic `ClientInterface`, unless `isolationOn` is set to `false`.

In addition to the unit tests, I have also tested this locally against MySQL instances of the Hive Metastore. I've also successfully ported Spark SQL to run with this client, but due to the size of the changes, that will come in a follow-up PR.

By default, Hive jars are currently downloaded from Maven automatically for a given version to ease packaging and testing. However, there is also support for specifying their location manually for deployments without internet access.

Author: Michael Armbrust <michael@databricks.com>

Closes #5851 from marmbrus/isolatedClient and squashes the following commits:

c72f6ac [Michael Armbrust] rxins comments
1e271fa [Michael Armbrust] [SPARK-6907][SQL] Isolated client for HiveMetastore
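
The three class-loading rules in the description above (shared, Hive, barrier) can be illustrated with a minimal sketch. This is not the code added by this PR: the class name `IsolatedLoaderSketch`, the prefix lists, and the helper `readClassBytes` are hypothetical and chosen only for illustration, while `execJars`, `baseClassLoader`, and `isolationOn` mirror the names used in the description.

```scala
import java.io.ByteArrayOutputStream
import java.net.{URL, URLClassLoader}

/**
 * Hypothetical sketch of the delegation rules; names and prefixes are assumptions.
 * - barrier classes: bytecode read from baseClassLoader, but defined anew here
 * - shared classes: delegated to baseClassLoader, so callers see the same Class
 * - everything else (e.g. Hive): loaded only from execJars
 */
class IsolatedLoaderSketch(
    execJars: Seq[URL],
    baseClassLoader: ClassLoader,
    isolationOn: Boolean = true,
    // Illustrative defaults, not the prefixes used by the real implementation.
    sharedPrefixes: Seq[String] = Seq("java.", "scala.", "org.slf4j.", "org.apache.spark."),
    barrierPrefixes: Seq[String] = Seq.empty) {

  def isBarrierClass(name: String): Boolean = barrierPrefixes.exists(p => name.startsWith(p))
  def isSharedClass(name: String): Boolean = sharedPrefixes.exists(p => name.startsWith(p))

  // Reads the raw .class bytes for `name` through the base loader's resources.
  private def readClassBytes(name: String): Array[Byte] = {
    val in = baseClassLoader.getResourceAsStream(name.replace('.', '/') + ".class")
    if (in == null) throw new ClassNotFoundException(name)
    try {
      val out = new ByteArrayOutputStream()
      val buf = new Array[Byte](4096)
      var n = in.read(buf)
      while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
      out.toByteArray
    } finally in.close()
  }

  // A null parent means only the bootstrap loader is consulted by default.
  private val rootLoader: ClassLoader = null

  val classLoader: ClassLoader =
    if (!isolationOn) {
      baseClassLoader
    } else {
      new URLClassLoader(execJars.toArray, rootLoader) {
        override def loadClass(name: String, resolve: Boolean): Class[_] = {
          val loaded = findLoadedClass(name)
          if (loaded != null) loaded else doLoadClass(name, resolve)
        }

        private def doLoadClass(name: String, resolve: Boolean): Class[_] = {
          if (isBarrierClass(name)) {
            // Barrier check comes first: these names may also match a shared prefix.
            val bytes = readClassBytes(name)
            defineClass(name, bytes, 0, bytes.length)
          } else if (isSharedClass(name)) {
            baseClassLoader.loadClass(name)
          } else {
            super.loadClass(name, resolve)
          }
        }
      }
    }
}
```

In this sketch the barrier check runs before the shared check because a barrier class such as `ClientWrapper` lives in a Spark package and would otherwise be delegated to `baseClassLoader`. Passing a null parent to `URLClassLoader` keeps anything that is neither shared nor barrier from leaking in through the application classpath.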
author: Michael Armbrust <michael@databricks.com> 2015-05-03 13:12:50 -0700
committer: Michael Armbrust <michael@databricks.com> 2015-05-03 13:12:50 -0700
commit: daa70bf135f23381f5f410aa95a1c0e5a2888568 (patch)
tree: f8abe90d96ca3ee31c6e4fa3939ba56fcfbf1c5f /core
parent: f4af92550cb90e47a12d4625fa615dd2b1587d42 (diff)
1 files changed, 1 insertions, 1 deletions
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index 42b5d41b7b..8a0327984e 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -701,7 +701,7 @@ object SparkSubmit {
 }
 
 /** Provides utility functions to be used inside SparkSubmit. */
-private[deploy] object SparkSubmitUtils {
+private[spark] object SparkSubmitUtils {
 
   // Exposed for testing
   var printStream = SparkSubmit.printStream