author | hyukjinkwon <gurwls223@gmail.com> | 2016-05-07 01:46:45 +0800 |
---|---|---|
committer | Cheng Lian <lian@databricks.com> | 2016-05-07 01:46:45 +0800 |
commit | fa928ff9a3c1de5d5aff9d14e6bc1bd03fcca087 (patch) | |
tree | b8bd870a6befb6dd0908eb97002f36ec41d38dfe /docs/configuration.md | |
parent | a03c5e68abd8c066c97ebd388883070d59dce1a7 (diff) | |
download | spark-fa928ff9a3c1de5d5aff9d14e6bc1bd03fcca087.tar.gz spark-fa928ff9a3c1de5d5aff9d14e6bc1bd03fcca087.tar.bz2 spark-fa928ff9a3c1de5d5aff9d14e6bc1bd03fcca087.zip |
[SPARK-14962][SQL] Do not push down isnotnull/isnull on unsupported types in ORC
## What changes were proposed in this pull request?
https://issues.apache.org/jira/browse/SPARK-14962
ORC filters were being pushed down for all types for both `IsNull` and `IsNotNull`.
On the Spark side this looks fine, because in Hive 1.2.x neither `IsNull` nor `IsNotNull` takes a type as an argument when the filter (`SearchArgument`) is built. On the ORC side, however, the filters do not work correctly: the stored statistics always produce `null` for unsupported types (e.g. `ArrayType`), so `IsNull` always evaluates to `true` and `IsNotNull` consequently always evaluates to `false`. (Please see [RecordReaderImpl.java#L296-L318](https://github.com/apache/hive/blob/branch-1.2/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java#L296-L318) and [RecordReaderImpl.java#L359-L365](https://github.com/apache/hive/blob/branch-1.2/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java#L359-L365) in Hive 1.2.)
Hive 1.3.x and later appear to prevent this by requiring a type ([`PredicateLeaf.Type`](https://github.com/apache/hive/blob/e085b7e9bd059d91aaf013df0db4d71dca90ec6f/storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java#L50-L56)) when a filter ([`SearchArgument`](https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java#L260)) is built, but Hive 1.2.x does not seem to do so.
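The Hive 1.2.x behaviour described above can be pictured with a small toy model. This is only a sketch, not Hive's actual code: the function names are hypothetical, and a row group's statistics are reduced to a single `stat_min` value, where `None` stands for "no usable statistics for this column type".

```python
from typing import Optional


def is_null_leaf(stat_min: Optional[int]) -> bool:
    """Evaluate an IsNull leaf against one row group's statistics.

    Toy assumption: `stat_min is None` means no usable statistics were
    stored for the column, which is read as "all values are null".
    A real reader would also consult null counts; that is omitted here.
    """
    return stat_min is None


def keep_row_group(op: str, stat_min: Optional[int]) -> bool:
    """Return True if the row group must be read for the given predicate."""
    if op == "IsNull":
        return is_null_leaf(stat_min)
    if op == "IsNotNull":
        # IsNotNull is modelled as the negation of IsNull.
        return not is_null_leaf(stat_min)
    raise ValueError(f"unsupported operator: {op}")


# A column of an unsupported type has no statistics (None),
# even though its rows may hold non-null data.
print(keep_row_group("IsNotNull", None))  # row group is skipped
print(keep_row_group("IsNotNull", 3))     # supported type: row group is read
```

In this model an `IsNotNull` filter on a column without statistics skips every row group, which matches the "always `false`" symptom reported in the JIRA issue.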
This PR prevents ORC filter creation for `IsNull` and `IsNotNull` on unsupported types. `OrcFilters` resembles `ParquetFilters`.
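The guard can be sketched as follows. This is a hedged illustration in Python rather than the Scala code in `OrcFilters`: `SEARCHABLE_TYPES`, `build_filter`, and the string type names are hypothetical, and the set of supported types is illustrative only.

```python
from typing import Dict, Optional

# Assumption: an illustrative set of types for which ORC keeps usable
# statistics. The real list lives in Spark's OrcFilters and may differ.
SEARCHABLE_TYPES = {"boolean", "int", "long", "double", "string"}


def build_filter(op: str, column: str, schema: Dict[str, str]) -> Optional[str]:
    """Return a filter description to push down, or None to skip pushdown."""
    if op not in ("IsNull", "IsNotNull"):
        return None  # other operators are out of scope for this sketch
    if schema[column] not in SEARCHABLE_TYPES:
        # Unsupported type (e.g. an array): create no ORC filter at all.
        return None
    return f"{op}({column})"


schema = {"id": "int", "tags": "array<int>"}
print(build_filter("IsNotNull", "id", schema))    # pushed down
print(build_filter("IsNotNull", "tags", schema))  # not pushed down
```

Returning `None` here stands for "no filter is handed to ORC", so the predicate is left for Spark to evaluate on the rows it reads.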
## How was this patch tested?
Unit tests in `OrcQuerySuite` and `OrcFilterSuite`, and `sbt scalastyle`.
Author: hyukjinkwon <gurwls223@gmail.com>
Author: Hyukjin Kwon <gurwls223@gmail.com>
Closes #12777 from HyukjinKwon/SPARK-14962.
Diffstat (limited to 'docs/configuration.md')
0 files changed, 0 insertions, 0 deletions