author: hyukjinkwon <gurwls223@gmail.com> 2016-05-07 01:46:45 +0800
committer: Cheng Lian <lian@databricks.com> 2016-05-07 01:46:45 +0800
commit: fa928ff9a3c1de5d5aff9d14e6bc1bd03fcca087
tree: b8bd870a6befb6dd0908eb97002f36ec41d38dfe /examples
parent: a03c5e68abd8c066c97ebd388883070d59dce1a7
[SPARK-14962][SQL] Do not push down isnotnull/isnull on unsupported types in ORC
## What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-14962

ORC filters were being pushed down for all types for both `IsNull` and `IsNotNull`.

This appears to be fine on the Spark side, because neither `IsNull` nor `IsNotNull` takes a type as an argument (Hive 1.2.x) when building filters (`SearchArgument`). However, they do not filter correctly, because the stored statistics are always `null` for unsupported types (e.g. `ArrayType`) on the ORC side. As a result, `IsNull` is always `true`, which makes `IsNotNull` always `false`. (Please see [RecordReaderImpl.java#L296-L318](https://github.com/apache/hive/blob/branch-1.2/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java#L296-L318) and [RecordReaderImpl.java#L359-L365](https://github.com/apache/hive/blob/branch-1.2/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java#L359-L365) in Hive 1.2.)

This appears to be prevented in Hive 1.3.x and later by requiring a type ([`PredicateLeaf.Type`](https://github.com/apache/hive/blob/e085b7e9bd059d91aaf013df0db4d71dca90ec6f/storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java#L50-L56)) when building a filter ([`SearchArgument`](https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java#L260)), but Hive 1.2.x does not seem to do this.

This PR prevents ORC filter creation for `IsNull` and `IsNotNull` on unsupported types. `OrcFilters` resembles `ParquetFilters`.

## How was this patch tested?

Unit tests in `OrcQuerySuite` and `OrcFilterSuite`, and `sbt scalastyle`.

Author: hyukjinkwon <gurwls223@gmail.com>
Author: Hyukjin Kwon <gurwls223@gmail.com>

Closes #12777 from HyukjinKwon/SPARK-14962.
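The guard described in this commit message can be sketched as a small toy model. This is not the actual `OrcFilters` code (which is Scala inside Spark). The function name `build_null_filter`, the tuple representation of a filter leaf, and the set of searchable type names are all hypothetical, chosen only to illustrate the idea that no filter is created when the column type is unsupported.

```python
from typing import Optional, Tuple

# Hypothetical set of type names for which pushdown is assumed safe.
# The commit message only names ArrayType as an example of an unsupported type.
SEARCHABLE_TYPES = {"boolean", "int", "long", "double", "string"}


def build_null_filter(column: str, data_type: str, op: str) -> Optional[Tuple[str, str]]:
    """Return a toy filter leaf for IsNull/IsNotNull, or None to skip pushdown."""
    if op not in ("IsNull", "IsNotNull"):
        raise ValueError(f"unexpected operator: {op}")
    if data_type not in SEARCHABLE_TYPES:
        # Unsupported type: create no filter, so the predicate is
        # evaluated after the scan instead of being pushed down.
        return None
    return (op, column)


print(build_null_filter("id", "int", "IsNotNull"))
print(build_null_filter("tags", "array", "IsNotNull"))
```

Returning `None` rather than a leaf means the data source never sees the predicate for such columns, so the always-`null` statistics cannot cause rows to be dropped incorrectly.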
Diffstat (limited to 'examples')
0 files changed, 0 insertions, 0 deletions