aboutsummaryrefslogtreecommitdiff
path: root/project/MimaExcludes.scala
diff options
context:
space:
mode:
authorCheng Lian <lian@databricks.com>2015-05-13 11:04:10 -0700
committerMichael Armbrust <michael@databricks.com>2015-05-13 11:04:10 -0700
commit7ff16e8abef9fbf4a4855e23c256b22e62e560a6 (patch)
tree1be1249ecb9db02ef5bf8820f7c44a7fbe71a6ff /project/MimaExcludes.scala
parentbec938f777a2e18757c7d04504d86a5342e2b49e (diff)
downloadspark-7ff16e8abef9fbf4a4855e23c256b22e62e560a6.tar.gz
spark-7ff16e8abef9fbf4a4855e23c256b22e62e560a6.tar.bz2
spark-7ff16e8abef9fbf4a4855e23c256b22e62e560a6.zip
[SPARK-7567] [SQL] Migrating Parquet data source to FSBasedRelation
This PR migrates Parquet data source to the newly introduced `FSBasedRelation`. `FSBasedParquetRelation` is created to replace `ParquetRelation2`. Major differences are: 1. Partition discovery code has been factored out to `FSBasedRelation` 1. `AppendingParquetOutputFormat` is not used now. Instead, an anonymous subclass of `ParquetOutputFormat` is used to handle appending and writing dynamic partitions 1. When scanning partitioned tables, `FSBasedParquetRelation.buildScan` only builds an `RDD[Row]` for a single selected partition 1. `FSBasedParquetRelation` doesn't rely on Catalyst expressions for filter push down, thus it doesn't extend `CatalystScan` anymore After migrating `JSONRelation` (which extends `CatalystScan`), we can remove `CatalystScan`. <!-- Reviewable:start --> [<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/apache/spark/6090) <!-- Reviewable:end --> Author: Cheng Lian <lian@databricks.com> Closes #6090 from liancheng/parquet-migration and squashes the following commits: 6063f87 [Cheng Lian] Casts to OutputCommitter rather than FileOutputCommtter bfd1cf0 [Cheng Lian] Fixes compilation error introduced while rebasing f9ea56e [Cheng Lian] Adds ParquetRelation2 related classes to MiMa check whitelist 261d8c1 [Cheng Lian] Minor bug fix and more tests db65660 [Cheng Lian] Migrates Parquet data source to FSBasedRelation
Diffstat (limited to 'project/MimaExcludes.scala')
-rw-r--r--project/MimaExcludes.scala6
1 files changed, 6 insertions, 0 deletions
diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index a47e29e2ef..f31f0e554e 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -111,6 +111,12 @@ object MimaExcludes {
"org.apache.spark.sql.parquet.ParquetRelation2$PartitionValues"),
ProblemFilters.exclude[MissingClassProblem](
"org.apache.spark.sql.parquet.ParquetRelation2$PartitionValues$"),
+ ProblemFilters.exclude[MissingClassProblem](
+ "org.apache.spark.sql.parquet.ParquetRelation2"),
+ ProblemFilters.exclude[MissingClassProblem](
+ "org.apache.spark.sql.parquet.ParquetRelation2$"),
+ ProblemFilters.exclude[MissingClassProblem](
+ "org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache"),
// These test support classes were moved out of src/main and into src/test:
ProblemFilters.exclude[MissingClassProblem](
"org.apache.spark.sql.parquet.ParquetTestData"),