path: root/project/SparkBuild.scala
author: Yash Datta <Yash.Datta@guavus.com> 2014-10-30 17:17:24 -0700
committer: Michael Armbrust <michael@databricks.com> 2014-10-30 17:17:31 -0700
commit: 2e35e24294ad8a5e76c89ea888fe330052dabd5a
tree: 4a04c807efa3e346e07aeba52593a20a745284a7 /project/SparkBuild.scala
parent: 9b6ebe33db27be38c3036ffeda17096043fb0fb9
[SPARK-3968][SQL] Use parquet-mr filter2 api

The parquet-mr project has introduced a new filter API (https://github.com/apache/incubator-parquet-mr/pull/4), along with several fixes. It can also eliminate entire RowGroups based on statistics such as min/max. We can leverage that to further improve the performance of queries with filters.

The filter2 API also introduces the ability to create custom filters. We can create a custom filter for the optimized In clause (InSet), so that elimination happens in the ParquetRecordReader itself.

Author: Yash Datta <Yash.Datta@guavus.com>

Closes #2841 from saucam/master and squashes the following commits:

8282ba0 [Yash Datta] SPARK-3968: fix scala code style and add some more tests for filtering on optional columns
515df1c [Yash Datta] SPARK-3968: Add a test case for filter pushdown on optional column
5f4530e [Yash Datta] SPARK-3968: Fix scala code style
f304667 [Yash Datta] SPARK-3968: Using task metadata strategy for row group filtering
ec53e92 [Yash Datta] SPARK-3968: No push down should result in case we are unable to create a record filter
48163c3 [Yash Datta] SPARK-3968: Code cleanup
cc7b596 [Yash Datta] SPARK-3968: 1. Fix RowGroupFiltering not working 2. Use the serialization/deserialization from Parquet library for filter pushdown
caed851 [Yash Datta] Revert "SPARK-3968: Not pushing the filters in case of OPTIONAL columns" since filtering on optional columns is now supported in filter2 api
49703c9 [Yash Datta] SPARK-3968: Not pushing the filters in case of OPTIONAL columns
9d09741 [Yash Datta] SPARK-3968: Change parquet filter pushdown to use filter2 api of parquet-mr

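
To illustrate the row-group elimination idea described in the message, here is a minimal, self-contained sketch. It does not use the parquet-mr API: `RowGroupStats`, `Pred`, `Gt`, `Lt`, `Eq`, `InSet`, and `RowGroupPruning` are hypothetical names introduced only for this example, and a single integer column is assumed. The point is only that min/max statistics can prove that no row in a group matches a filter, so the group can be skipped without being read.

```scala
// Illustrative sketch only: these types are hypothetical and are NOT the parquet-mr API.

// Assumed per-row-group statistics for a single Int column.
final case class RowGroupStats(min: Int, max: Int)

// A tiny predicate language over that column.
sealed trait Pred
final case class Gt(value: Int) extends Pred           // column > value
final case class Lt(value: Int) extends Pred           // column < value
final case class Eq(value: Int) extends Pred           // column == value
final case class InSet(values: Set[Int]) extends Pred  // column IN (values)

object RowGroupPruning {
  // True when the statistics prove that no row in the group can satisfy
  // the predicate, so the whole row group can be skipped.
  def canDrop(pred: Pred, stats: RowGroupStats): Boolean = pred match {
    case Gt(v)     => stats.max <= v
    case Lt(v)     => stats.min >= v
    case Eq(v)     => v < stats.min || v > stats.max
    // An IN filter can drop the group only if every candidate value
    // falls outside the group's [min, max] range.
    case InSet(vs) => vs.forall(v => v < stats.min || v > stats.max)
  }

  // Keep only the row groups that might contain matching rows.
  def keep(pred: Pred, groups: Seq[RowGroupStats]): Seq[RowGroupStats] =
    groups.filterNot(canDrop(pred, _))
}
```

For example, with row groups covering [0, 9], [10, 19], and [20, 29], `RowGroupPruning.keep(InSet(Set(3, 25)), groups)` retains only the first and third groups. Rows inside the retained groups would still need record-level filtering, since statistics can only rule groups out, not confirm matches.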
Diffstat (limited to 'project/SparkBuild.scala')
0 files changed, 0 insertions, 0 deletions