author | Cheng Lian <lian@databricks.com> | 2016-04-27 13:55:07 -0700 |
---|---|---|
committer | Yin Huai <yhuai@databricks.com> | 2016-04-27 13:55:13 -0700 |
commit | 24bea000476cdd0b43be5160a76bc5b170ef0b42 (patch) | |
tree | 5336028911f2db913d333bbfbf17b54e1b843f5c /sql/catalyst/src | |
parent | f405de87c878c49b17acb2c874be1084465384e9 (diff) | |
[SPARK-14954] [SQL] Add PARTITION BY and BUCKET BY clause for data source CTAS syntax
Currently, we can only create persisted partitioned and/or bucketed data source tables using the Dataset API but not using SQL DDL. This PR implements the following syntax to add partitioning and bucketing support to the SQL DDL:
```
CREATE TABLE <table-name>
USING <provider> [OPTIONS (<key1> <value1>, <key2> <value2>, ...)]
[PARTITIONED BY (col1, col2, ...)]
[CLUSTERED BY (col1, col2, ...) [SORTED BY (col1, col2, ...)] INTO <n> BUCKETS]
AS SELECT ...
```
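As an illustration of the clause order above, here is a minimal sketch that renders a CTAS statement from its parts. The helper name `render_ctas`, the table name `events_by_day`, and the source table `raw_events` are hypothetical and not part of this patch; the helper does no identifier quoting or escaping, and it assumes option values are written as single-quoted string literals.

```python
def render_ctas(table, provider, query, options=None,
                partition_cols=(), bucket_cols=(), sort_cols=(), num_buckets=None):
    """Render a data source CTAS statement in the clause order shown above.

    Hypothetical helper for illustration only: no quoting or escaping is done.
    """
    if sort_cols and not bucket_cols:
        raise ValueError("SORTED BY is only valid inside a CLUSTERED BY clause")
    if bucket_cols and num_buckets is None:
        raise ValueError("CLUSTERED BY requires INTO <n> BUCKETS")

    lines = [f"CREATE TABLE {table}"]

    # USING <provider> [OPTIONS (...)]; values assumed to be string literals
    using = f"USING {provider}"
    if options:
        pairs = ", ".join(f"{key} '{value}'" for key, value in options.items())
        using += f" OPTIONS ({pairs})"
    lines.append(using)

    # [PARTITIONED BY (col1, col2, ...)]
    if partition_cols:
        lines.append(f"PARTITIONED BY ({', '.join(partition_cols)})")

    # [CLUSTERED BY (...) [SORTED BY (...)] INTO <n> BUCKETS]
    if bucket_cols:
        clause = f"CLUSTERED BY ({', '.join(bucket_cols)})"
        if sort_cols:
            clause += f" SORTED BY ({', '.join(sort_cols)})"
        clause += f" INTO {num_buckets} BUCKETS"
        lines.append(clause)

    lines.append(f"AS {query}")
    return "\n".join(lines)


statement = render_ctas(
    table="events_by_day",  # hypothetical table name
    provider="parquet",
    query="SELECT user_id, event_date, payload FROM raw_events",  # hypothetical source
    partition_cols=["event_date"],
    bucket_cols=["user_id"],
    sort_cols=["user_id"],
    num_buckets=8,
)
print(statement)
```

The rendered string could then be passed to a SQL entry point on a build that includes this patch.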
Test cases are added in `MetastoreDataSourcesSuite` to check the newly added syntax.
Author: Cheng Lian <lian@databricks.com>
Author: Yin Huai <yhuai@databricks.com>
Closes #12734 from liancheng/spark-14954.
Diffstat (limited to 'sql/catalyst/src')
-rw-r--r-- | sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 | 4 |
1 files changed, 3 insertions, 1 deletions
diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
index 6e04f6eb80..c356f0c3f1 100644
--- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
+++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
@@ -47,7 +47,9 @@ statement
     | createTableHeader ('(' colTypeList ')')? tableProvider
         (OPTIONS tablePropertyList)?                                   #createTableUsing
     | createTableHeader tableProvider
-        (OPTIONS tablePropertyList)? AS? query                         #createTableUsing
+        (OPTIONS tablePropertyList)?
+        (PARTITIONED BY partitionColumnNames=identifierList)?
+        bucketSpec? AS? query                                          #createTableUsing
     | createTableHeader ('(' columns=colTypeList ')')?
         (COMMENT STRING)?
         (PARTITIONED BY '(' partitionColumns=colTypeList ')')?
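The new alternative fixes the order of the optional clauses: `PARTITIONED BY`, then `bucketSpec`, then an optional `AS`, then the query. The sketch below mirrors that order with a regular expression. It is not the ANTLR parser; the shape of `bucketSpec` is assumed from the commit message because the rule itself is not part of this diff, identifiers are simplified to `\w+`, the query is assumed to start with `SELECT`, and the names `CTAS_TAIL` and `accepts_ctas_tail` are hypothetical.

```python
import re

# Simplified identifier list: "(a, b, c)"; real identifiers may be quoted.
IDENT_LIST = r"\(\s*\w+(?:\s*,\s*\w+)*\s*\)"

# Matches the part of the statement that follows "USING <provider> [OPTIONS (...)]".
CTAS_TAIL = re.compile(
    rf"""
    (?:\s+PARTITIONED\s+BY\s*{IDENT_LIST})?      # (PARTITIONED BY identifierList)?
    (?:\s+CLUSTERED\s+BY\s*{IDENT_LIST}          # bucketSpec? (shape assumed)
        (?:\s+SORTED\s+BY\s*{IDENT_LIST})?
        \s+INTO\s+\d+\s+BUCKETS)?
    (?:\s+AS)?                                   # AS?
    \s+SELECT\b                                  # start of query (simplified)
    """,
    re.IGNORECASE | re.VERBOSE,
)


def accepts_ctas_tail(tail):
    """Return True if the clauses appear in the order the grammar rule expects."""
    return CTAS_TAIL.match(tail) is not None


in_order = " PARTITIONED BY (a, b) CLUSTERED BY (c) SORTED BY (c) INTO 4 BUCKETS AS SELECT 1"
swapped = " CLUSTERED BY (c) INTO 4 BUCKETS PARTITIONED BY (a) AS SELECT 1"
print(accepts_ctas_tail(in_order), accepts_ctas_tail(swapped))
```

Because each clause is optional but positionally fixed, a statement that places `CLUSTERED BY` before `PARTITIONED BY` is rejected by this sketch, which is the behaviour the grammar alternative implies.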