author: Andrew Or <andrew@databricks.com> 2016-02-21 15:00:24 -0800
committer: Reynold Xin <rxin@databricks.com> 2016-02-21 15:00:24 -0800
commit: 6c3832b26e119626205732b8fd03c8f5ba986896
tree: c23d83055b66647662414f4a5f835ec30efbe64f /docs
parent: 7eb83fefd19e137d80a23b5174b66b14831c291a
[SPARK-13080][SQL] Implement new Catalog API using Hive
## What changes were proposed in this pull request?

This is a step towards merging `SQLContext` and `HiveContext`. A new internal Catalog API was introduced in #10982 and extended in #11069. This patch introduces an implementation of this API using `HiveClient`, an existing interface to Hive. It also extends `HiveClient` with additional calls to Hive that are needed to complete the catalog implementation.

*Where should I start reviewing?* The new catalog introduced is `HiveCatalog`. This class is relatively simple because it just calls `HiveClientImpl`, where most of the new logic is. I would not start with `HiveClient`, `HiveQl`, or `HiveMetastoreCatalog`, which are modified mainly because of a refactor.

*Why is this patch so big?* I had to refactor `HiveClient` to remove an intermediate representation of databases, tables, partitions, etc. After this refactor, `CatalogTable` converts directly to and from `HiveTable` (etc.). Otherwise we would have to first convert `CatalogTable` to the intermediate representation and then convert that to `HiveTable`, which is messy.

The new class hierarchy is as follows:
```
org.apache.spark.sql.catalyst.catalog.Catalog
  - org.apache.spark.sql.catalyst.catalog.InMemoryCatalog
  - org.apache.spark.sql.hive.HiveCatalog
```

Note that, as of this patch, none of these classes are used anywhere yet. This will come in the future before the Spark 2.0 release.

## How was this patch tested?

All existing unit tests, and `HiveCatalogSuite` that extends `CatalogTestCases`.

Author: Andrew Or <andrew@databricks.com>
Author: Reynold Xin <rxin@databricks.com>

Closes #11293 from rxin/hive-catalog.
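
For readers unfamiliar with the layout described above, here is a rough, self-contained Scala sketch of the pattern: one catalog interface with an in-memory backend and a thin backend that forwards every call to a metastore client. Every name ending in `Sketch` is hypothetical, invented for illustration. None of this is the actual `Catalog`, `HiveCatalog`, or `HiveClient` code, whose methods and signatures differ and are far more extensive.

```scala
// Illustrative sketch only. Every name ending in "Sketch" is hypothetical and
// much simpler than the real Spark classes named in the commit message.
import scala.collection.mutable

// Simplified stand-in for a catalog table description.
final case class TableSketch(name: String, columns: Seq[(String, String)])

// Stand-in for the internal catalog API: one interface, several backends.
trait CatalogSketch {
  def createTable(db: String, table: TableSketch): Unit
  def getTable(db: String, name: String): Option[TableSketch]
  def listTables(db: String): Seq[String]
}

// Backend 1: keeps all metadata in a local map keyed by (database, table).
class InMemoryCatalogSketch extends CatalogSketch {
  private val tables = mutable.Map.empty[(String, String), TableSketch]

  override def createTable(db: String, table: TableSketch): Unit =
    tables.update((db, table.name), table)

  override def getTable(db: String, name: String): Option[TableSketch] =
    tables.get((db, name))

  override def listTables(db: String): Seq[String] =
    tables.keys.collect { case (d, n) if d == db => n }.toSeq.sorted
}

// Stand-in for a client interface to an external metastore.
trait MetastoreClientSketch {
  def putTable(db: String, table: TableSketch): Unit
  def fetchTable(db: String, name: String): Option[TableSketch]
  def tableNames(db: String): Seq[String]
}

// Backend 2: a thin catalog that forwards each call to the client and hands
// the table description over directly, with no intermediate representation.
class DelegatingCatalogSketch(client: MetastoreClientSketch) extends CatalogSketch {
  override def createTable(db: String, table: TableSketch): Unit =
    client.putTable(db, table)

  override def getTable(db: String, name: String): Option[TableSketch] =
    client.fetchTable(db, name)

  override def listTables(db: String): Seq[String] =
    client.tableNames(db)
}

// Fake client so the sketch runs without any external metastore.
class FakeClientSketch extends MetastoreClientSketch {
  private val store = mutable.Map.empty[(String, String), TableSketch]

  override def putTable(db: String, table: TableSketch): Unit =
    store.update((db, table.name), table)

  override def fetchTable(db: String, name: String): Option[TableSketch] =
    store.get((db, name))

  override def tableNames(db: String): Seq[String] =
    store.keys.collect { case (d, n) if d == db => n }.toSeq.sorted
}
```

Because both backends share one interface, the same checks can be run against either of them, which is presumably the idea behind having `HiveCatalogSuite` extend `CatalogTestCases`.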
Diffstat (limited to 'docs')
0 files changed, 0 insertions, 0 deletions