author | Xiao Li <gatorsmile@gmail.com> | 2017-03-29 12:43:22 -0700 |
---|---|---|
committer | Xiao Li <gatorsmile@gmail.com> | 2017-03-29 12:43:22 -0700 |
commit | 5c8ef376e874497766ba0cc4d97429e33a3d9c61 (patch) | |
tree | de98b70e04c791c15d3729f32cd5e3ae5624ab75 /pom.xml | |
parent | c4008480b781379ac0451b9220300d83c054c60d (diff) | |
[SPARK-17075][SQL][FOLLOWUP] Add Estimation of Constant Literal
### What changes were proposed in this pull request?
`FalseLiteral` and `TrueLiteral` should have been eliminated by the optimizer rule `BooleanSimplification`, but null literals might be added by the optimizer rule `NullPropagation`. For safety, our filter estimation should handle all the eligible literal cases.
Our optimizer rule `BooleanSimplification` is unable to remove the null literal in many cases, for example `a < 0 or null`. Thus, we need to handle the null literal in filter estimation.
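To illustrate what handling the literal cases could look like, here is a minimal sketch using a toy model rather than Catalyst's own classes. The names `BoolLiteral`, `TrueLit`, `FalseLit`, `NullLit`, and `LiteralSelectivity` are hypothetical, and the selectivity values are an assumption based on SQL `WHERE` semantics, where a row is kept only if the condition evaluates to true.

```scala
// Toy model, NOT Catalyst's classes: a boolean literal is true, false, or NULL.
sealed trait BoolLiteral
case object TrueLit extends BoolLiteral
case object FalseLit extends BoolLiteral
case object NullLit extends BoolLiteral

object LiteralSelectivity {
  // Fraction of rows expected to pass a filter whose whole condition is
  // the given literal (assumed values, following SQL WHERE semantics).
  def selectivity(lit: BoolLiteral): Double = lit match {
    case TrueLit  => 1.0 // every row passes
    case FalseLit => 0.0 // no row passes
    case NullLit  => 0.0 // NULL is not true, so no row passes
  }
}
```

In this sketch a null literal is treated like `false` for estimation purposes, since neither lets a row through the filter.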
`Not` can be pushed down below `And` and `Or`. We could then see two consecutive `Not`, which need to be collapsed into one. Because filter estimation supports only a limited set of expressions, we just need to handle the case `Not(null)` to avoid incorrect results caused by boolean operations on null. For details, see the matrix below.
```
not NULL = NULL
NULL or false = NULL
NULL or true = true
NULL or NULL = NULL
NULL and false = false
NULL and true = NULL
NULL and NULL = NULL
```
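The matrix above can be reproduced with a small sketch of SQL three-valued logic. This is an illustration only, not the code added by this patch: `ThreeValuedLogic`, `not3`, `or3`, and `and3` are hypothetical names, and `None` stands in for `NULL`.

```scala
// Hypothetical helper, NOT Spark's implementation: SQL three-valued logic,
// with None standing in for NULL.
object ThreeValuedLogic {
  type Tri = Option[Boolean]

  // not NULL = NULL; otherwise plain negation.
  def not3(a: Tri): Tri = a.map(!_)

  // OR is true if either side is true, false only if both sides are false,
  // and NULL in every remaining case.
  def or3(a: Tri, b: Tri): Tri = (a, b) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  // AND is false if either side is false, true only if both sides are true,
  // and NULL in every remaining case.
  def and3(a: Tri, b: Tri): Tri = (a, b) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }
}
```

For instance, `ThreeValuedLogic.or3(None, Some(true))` yields `Some(true)` while `ThreeValuedLogic.and3(None, Some(true))` yields `None`, matching the `NULL or true` and `NULL and true` rows.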
### How was this patch tested?
Added the test cases.
Author: Xiao Li <gatorsmile@gmail.com>
Closes #17446 from gatorsmile/constantFilterEstimation.