author Wenchen Fan <wenchen@databricks.com> 2016-08-21 22:23:14 -0700
committer Yin Huai <yhuai@databricks.com> 2016-08-21 22:23:14 -0700
commit b2074b664a9c269c4103760d40c4a14e7aeb1e83
tree 58cf286848123d09fb9e29bc92a800b0fc91ef88 /common
parent 91c2397684ab791572ac57ffb2a924ff058bb64f
[SPARK-16498][SQL] move hive hack for data source table into HiveExternalCatalog
## What changes were proposed in this pull request?

Spark SQL doesn't have its own metastore yet and currently uses Hive's. However, the Hive metastore has some limitations (e.g. it cannot store too many columns, it is not case-preserving, and it has poor decimal type support), so we rely on some hacks to store data source table metadata in the Hive metastore, i.e. we put all the information in table properties.

This PR moves these hacks into `HiveExternalCatalog`, to isolate the Hive-specific logic in one place.

Changes overview:

1. **before this PR**: we had to put the metadata (schema, partition columns, etc.) of data source tables into table properties before saving it to the external catalog, even when the external catalog doesn't use the Hive metastore (e.g. `InMemoryCatalog`).
   **after this PR**: the table properties tricks live only in `HiveExternalCatalog`; the caller side doesn't need to take care of them anymore.
2. **before this PR**: because the table properties tricks were done outside of the external catalog, we also had to revert them whenever we read the table metadata from the external catalog and used it, e.g. in `DescribeTableCommand` we read the schema and partition columns from table properties.
   **after this PR**: the table metadata read from the external catalog is exactly the same as what we saved to it.

Bonus: we can now create a data source table using `SessionCatalog`, if the schema is specified.

Breaks: `schemaStringLengthThreshold` is not configurable anymore. `hive.default.rcfile.serde` is not configurable anymore.

## How was this patch tested?

Existing tests.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #14155 from cloud-fan/catalog-table.
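
For readers unfamiliar with the table-properties trick mentioned in the message, here is a minimal, language-neutral sketch in Python. It is not the code from this patch: the property keys, the threshold value, and the `PropertyBackedCatalog` class are all hypothetical names chosen for illustration. It only shows the general idea of serializing a schema into length-limited string properties inside the catalog, so that callers get back exactly what they saved.

```python
import json

# Hypothetical property keys and threshold, chosen for illustration only.
NUM_PARTS_KEY = "example.schema.numParts"
PART_KEY_PREFIX = "example.schema.part."
PART_LENGTH_THRESHOLD = 16


def schema_to_properties(schema, threshold=PART_LENGTH_THRESHOLD):
    """Serialize a schema to JSON and split it across string properties."""
    text = json.dumps(schema)
    parts = [text[i:i + threshold] for i in range(0, len(text), threshold)]
    props = {NUM_PARTS_KEY: str(len(parts))}
    for index, part in enumerate(parts):
        props[PART_KEY_PREFIX + str(index)] = part
    return props


def schema_from_properties(props):
    """Reassemble the JSON parts in order and parse the schema back."""
    num_parts = int(props[NUM_PARTS_KEY])
    text = "".join(props[PART_KEY_PREFIX + str(i)] for i in range(num_parts))
    return json.loads(text)


class PropertyBackedCatalog:
    """Toy catalog that hides the property encoding from its callers."""

    def __init__(self):
        self._stored = {}  # table name -> string properties only

    def create_table(self, name, schema):
        # The encoding happens inside the catalog, not at the call site.
        self._stored[name] = schema_to_properties(schema)

    def get_table_schema(self, name):
        # The decoding also happens here, so callers see the original schema.
        return schema_from_properties(self._stored[name])
```

With this shape, a caller passes a plain schema to `create_table` and reads the same schema back from `get_table_schema`, without ever touching the properties. A catalog that is not backed by such a store could keep the schema directly and skip the encoding.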
Diffstat (limited to 'common')
0 files changed, 0 insertions, 0 deletions