author | Joseph Batchik <joseph.batchik@cloudera.com> | 2015-08-08 11:03:01 -0700 |
---|---|---|
committer | Reynold Xin <rxin@databricks.com> | 2015-08-08 11:03:01 -0700 |
commit | a3aec918bed22f8e33cf91dc0d6e712e6653c7d2 (patch) | |
tree | 6c8bf644c083f7e7f0ede49873debb45d805cb5d /sql/hive/src/main | |
parent | 23695f1d2d7ef9f3ea92cebcd96b1cf0e8904eb4 (diff) | |
[SPARK-9486][SQL] Add data source aliasing for external packages
Users currently have to provide the full class name for external data sources, like:
`sqlContext.read.format("com.databricks.spark.avro").load(path)`
This change allows external data source packages to register themselves through a Service Loader so that they can add a custom alias, like:
`sqlContext.read.format("avro").load(path)`
As a result, external data source packages can be selected with the same short format names as the internal data sources such as parquet and json.
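The alias lookup described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code added by this commit: the `DataSourceRegister` trait here is a simplified stand-in for `org.apache.spark.sql.sources.DataSourceRegister`, while `AvroSource`, `AliasLookup`, `registered`, and `resolve` are hypothetical names introduced only for the example. It assumes that an alias is matched case-insensitively, that an unknown name is treated as a full class name, and that more than one match is reported as an error.

```scala
import java.util.ServiceLoader
import scala.collection.mutable.ListBuffer

// Simplified stand-in for org.apache.spark.sql.sources.DataSourceRegister (assumption).
trait DataSourceRegister {
  def format(): String
}

// Hypothetical external data source that registers the alias "avro".
class AvroSource extends DataSourceRegister {
  def format(): String = "avro"
}

object AliasLookup {
  // Collects every implementation listed in a
  // META-INF/services/<fully qualified trait name> file on the classpath.
  def registered(): List[DataSourceRegister] = {
    val it = ServiceLoader.load(classOf[DataSourceRegister]).iterator()
    val found = ListBuffer.empty[DataSourceRegister]
    while (it.hasNext) found += it.next()
    found.toList
  }

  // Resolves a user-supplied format string to a class name.
  // Right(className) on success, Left(message) when the alias is ambiguous.
  def resolve(name: String, sources: Seq[DataSourceRegister]): Either[String, String] =
    sources.filter(_.format().equalsIgnoreCase(name)).toList match {
      // No alias matched: assume the caller passed a full class name.
      case Nil => Right(name)
      // Exactly one source claims the alias.
      case single :: Nil => Right(single.getClass.getName)
      // Several sources claim the same alias: report the conflict.
      case many =>
        Left(s"Multiple sources found for $name: " +
          many.map(_.getClass.getName).mkString(", "))
    }
}
```

In a real package, `AliasLookup.registered()` would only find `AvroSource` if the jar also shipped a services file naming that class, in the same way this commit adds one for `org.apache.spark.sql.hive.orc.DefaultSource`.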
Author: Joseph Batchik <joseph.batchik@cloudera.com>
Author: Joseph Batchik <josephbatchik@gmail.com>
Closes #7802 from JDrit/service_loader and squashes the following commits:
49a01ec [Joseph Batchik] fixed a couple of format / error bugs
e5e93b2 [Joseph Batchik] modified rat file to only excluded added services
72b349a [Joseph Batchik] fixed error with orc data source actually
9f93ea7 [Joseph Batchik] fixed error with orc data source
87b7f1c [Joseph Batchik] fixed typo
101cd22 [Joseph Batchik] removing unneeded changes
8f3cf43 [Joseph Batchik] merged in changes
b63d337 [Joseph Batchik] merged in master
95ae030 [Joseph Batchik] changed the new trait to be used as a mixin for data source to register themselves
74db85e [Joseph Batchik] reformatted class loader
ac2270d [Joseph Batchik] removing some added test
a6926db [Joseph Batchik] added test cases for data source loader
208a2a8 [Joseph Batchik] changes to do error catching if there are multiple data sources
946186e [Joseph Batchik] started working on service loader
Diffstat (limited to 'sql/hive/src/main')
-rw-r--r-- | sql/hive/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister | 1 |
-rw-r--r-- | sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala | 5 |
2 files changed, 5 insertions, 1 deletions
diff --git a/sql/hive/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister b/sql/hive/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
new file mode 100644
index 0000000000..4a774fbf1f
--- /dev/null
+++ b/sql/hive/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
@@ -0,0 +1 @@
+org.apache.spark.sql.hive.orc.DefaultSource
diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala
index 7c8704b47f..0c344c63fd 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala
@@ -47,7 +47,10 @@ import org.apache.spark.util.SerializableConfiguration
 /* Implicit conversions */
 import scala.collection.JavaConversions._
 
-private[sql] class DefaultSource extends HadoopFsRelationProvider {
+private[sql] class DefaultSource extends HadoopFsRelationProvider with DataSourceRegister {
+
+  def format(): String = "orc"
+
   def createRelation(
       sqlContext: SQLContext,
       paths: Array[String],