[SPARK-4553] [SPARK-5767] [SQL] Wires Parquet data source with the newly introduced write support for data source API - spark


author	Cheng Lian <lian@databricks.com>	2015-02-16 01:38:31 -0800
committer	Cheng Lian <lian@databricks.com>	2015-02-16 01:38:31 -0800
commit	3ce58cf9c0ffe8b867ca79b404fe3fa291cf0e56 (patch)
tree	a583c820c1cecd46fb021323d88ac3e50af01b98 /streaming
parent	199a9e80275ac70582ea32f0f2f5a0a15b168785 (diff)
download	spark-3ce58cf9c0ffe8b867ca79b404fe3fa291cf0e56.tar.gz spark-3ce58cf9c0ffe8b867ca79b404fe3fa291cf0e56.tar.bz2 spark-3ce58cf9c0ffe8b867ca79b404fe3fa291cf0e56.zip

[SPARK-4553] [SPARK-5767] [SQL] Wires Parquet data source with the newly introduced write support for data source API

This PR migrates the Parquet data source to the new data source write support API. Users can now also overwrite and append to existing tables. Note that inserting into partitioned tables is not supported yet.

When the Parquet data source is enabled, insertion into Hive Metastore Parquet tables is also fulfilled by the Parquet data source. This is done by the newly introduced `HiveMetastoreCatalog.ParquetConversions` rule, which is a "proper" implementation of the original hacky `HiveStrategies.ParquetConversion`. The latter is still preserved, and can be removed together with the old Parquet support in the future.

TODO:

- [x] Update outdated comments in `newParquet.scala`.

Author: Cheng Lian <lian@databricks.com>

Closes #4563 from liancheng/parquet-refining and squashes the following commits:

fa98d27 [Cheng Lian] Fixes test cases which should disable off Parquet data source
2476e82 [Cheng Lian] Fixes compilation error introduced during rebasing
a83d290 [Cheng Lian] Passes Hive Metastore partitioning information to ParquetRelation2
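For illustration only, overwriting and then appending to a Parquet location could look roughly like the sketch below. It is not taken from this patch. It assumes the Spark 1.3-era `DataFrame.save(path, source, mode)` and `SQLContext.load(path, source)` methods and the `SaveMode` enum; the object name, output path, and sample rows are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SaveMode}

// Hypothetical driver program; assumes Spark 1.3-era APIs on the classpath.
object ParquetWriteSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("parquet-write-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Hypothetical output location, chosen only for this example.
    val path = "/tmp/parquet-write-sketch"

    val first = sc.parallelize(Seq((1, "a"), (2, "b"))).toDF("id", "name")
    val second = sc.parallelize(Seq((3, "c"), (4, "d"))).toDF("id", "name")

    // Overwrite replaces whatever data already exists at `path`.
    first.save(path, "parquet", SaveMode.Overwrite)

    // Append adds the new rows to the data already written.
    second.save(path, "parquet", SaveMode.Append)

    // Read the location back and print the number of rows found.
    println(sqlContext.load(path, "parquet").count())

    sc.stop()
  }
}
```

The sketch writes to a plain, unpartitioned path, in line with the note above that inserting into partitioned tables is not supported yet.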

Diffstat (limited to 'streaming')

0 files changed, 0 insertions, 0 deletions

