author | Herman van Hovell <hvanhovell@questtec.nl> | 2016-01-06 11:16:53 -0800 |
---|---|---|
committer | Reynold Xin <rxin@databricks.com> | 2016-01-06 11:16:53 -0800 |
commit | ea489f14f11b2fdfb44c86634d2e2c2167b6ea18 (patch) | |
tree | 8b29d929015ed8337a8c485f7d88a81dbbfe0166 /python/pyspark/mllib/clustering.py | |
parent | 3aa3488225af12a77da3ba807906bc6a461ef11c (diff) | |
[SPARK-12573][SPARK-12574][SQL] Move SQL Parser from Hive to Catalyst
This PR moves a major part of the new SQL parser to Catalyst. This is a prelude to using this parser for all of our SQL parsing. The following key changes have been made:
The ANTLR Parser & supporting classes have been moved to the Catalyst project. They are now part of the ```org.apache.spark.sql.catalyst.parser``` package. These classes contained quite a bit of code that originally came from the Hive project; I have added acknowledgements wherever this applied. All Hive dependencies have been factored out. I have also taken this chance to clean up the ```ASTNode``` class and to improve the error handling.
The HiveQl object that provides the functionality to convert an AST into a LogicalPlan has been refactored into three different classes, one for every SQL sub-project:
- ```CatalystQl```: This implements Query and Expression parsing functionality.
- ```SparkQl```: This is a subclass of ```CatalystQl``` and provides SQL/Core-only functionality such as Explain and Describe.
- ```HiveQl```: This is a subclass of ```SparkQl``` and adds Hive-only functionality to the parser, such as Analyze, Drop, Views, CTAS & Transforms. This class still depends on Hive.
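The layering described in the list above can be sketched roughly as follows. This is only an illustration of the delegation pattern, not the code in this PR: ```Node```, the ```*Sketch``` class names, the ```plan``` method, and the string "plans" are all hypothetical stand-ins, and the real classes operate on ```ASTNode``` and produce a LogicalPlan.

```scala
// Hypothetical stand-in for an AST node; the real parser uses ASTNode.
final case class Node(kind: String, children: List[Node] = Nil)

// Base layer (assumed shape): handles queries and expressions only.
class CatalystQlSketch {
  def plan(node: Node): String = node.kind match {
    case "QUERY" => "Query"
    case other =>
      throw new UnsupportedOperationException(s"Unsupported node: $other")
  }
}

// SQL/Core layer: adds Explain and Describe, delegates everything else upward.
class SparkQlSketch extends CatalystQlSketch {
  override def plan(node: Node): String = node.kind match {
    // plan(...) is dispatched virtually, so a subclass's commands can be explained too.
    case "EXPLAIN"  => s"Explain(${plan(node.children.head)})"
    case "DESCRIBE" => "Describe"
    case _          => super.plan(node)
  }
}

// Hive layer: adds Hive-only commands, delegates everything else upward.
class HiveQlSketch extends SparkQlSketch {
  override def plan(node: Node): String = node.kind match {
    case "ANALYZE" => "Analyze"
    case "DROP"    => "Drop"
    case _         => super.plan(node)
  }
}
```

In this sketch each layer matches only the node kinds it owns and falls through to its parent for the rest, so a node that no layer recognises ends up as an error raised by the base class.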
cc rxin
Author: Herman van Hovell <hvanhovell@questtec.nl>
Closes #10583 from hvanhovell/SPARK-12575.