author | Herman van Hövell tot Westerflier <hvanhovell@questtec.nl> | 2016-06-12 21:30:32 -0700 |
---|---|---|
committer | Reynold Xin <rxin@databricks.com> | 2016-06-12 21:30:32 -0700 |
commit | 1f8f2b5c2a33e63367ea4881b5918f6bc0a6f52f (patch) | |
tree | 5d35fcdd61d1fc2eb2554d55db291b9d5248707f /examples/src/main/resources/users.parquet | |
parent | f5d38c39255cc75325c6639561bfec1bc051f788 (diff) | |
[SPARK-15370][SQL] Fix count bug
# What changes were proposed in this pull request?
This pull request fixes the COUNT bug in the `RewriteCorrelatedScalarSubquery` rule.
After this change, the rule tests the expression at the root of the correlated subquery to determine whether the expression returns `NULL` on empty input. If the expression does not return `NULL`, the rule generates additional logic in the `Project` operator above the rewritten subquery. This additional logic intercepts `NULL` values coming from the outer join and replaces them with the value that the subquery's expression would return on empty input.
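The COUNT bug described above can be reproduced outside Spark. The following is a minimal sketch using Python's built-in `sqlite3` module, not Spark's actual rule. The tables `l` and `r`, their columns, and the three queries are hypothetical names chosen for illustration. It compares a correlated scalar subquery, a naive outer-join rewrite that leaks `NULL`, and a rewrite that substitutes the value `COUNT(*)` returns on empty input. The real rule works on Catalyst plans and may need more than a plain `COALESCE`, for example to tell a join miss apart from a genuine `NULL`.

```python
import sqlite3

# Hypothetical tables: l is the outer relation, r is the subquery relation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE l (a INTEGER);
CREATE TABLE r (c INTEGER);
INSERT INTO l VALUES (1), (2);
INSERT INTO r VALUES (1), (1);
""")

# Reference semantics: COUNT(*) over an empty correlated input yields 0.
correlated_rows = conn.execute("""
    SELECT a, (SELECT COUNT(*) FROM r WHERE r.c = l.a) AS cnt
    FROM l ORDER BY a
""").fetchall()

# Naive decorrelation: aggregate first, then left outer join.
# Outer rows with no match get NULL instead of 0 (the COUNT bug).
naive_rows = conn.execute("""
    SELECT l.a, sub.cnt
    FROM l LEFT OUTER JOIN
         (SELECT c, COUNT(*) AS cnt FROM r GROUP BY c) sub
         ON sub.c = l.a
    ORDER BY l.a
""").fetchall()

# Simplified fix: a projection above the join replaces the NULL
# with the value the aggregate would return on empty input.
fixed_rows = conn.execute("""
    SELECT l.a, COALESCE(sub.cnt, 0) AS cnt
    FROM l LEFT OUTER JOIN
         (SELECT c, COUNT(*) AS cnt FROM r GROUP BY c) sub
         ON sub.c = l.a
    ORDER BY l.a
""").fetchall()

print(correlated_rows)  # [(1, 2), (2, 0)]
print(naive_rows)       # [(1, 2), (2, None)]
print(fixed_rows)       # [(1, 2), (2, 0)]
```

Only the row with `a = 2`, which has no match in `r`, differs between the naive and the fixed rewrite. An aggregate such as `MAX` returns `NULL` on empty input, so it would not need the extra projection.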
This PR takes over https://github.com/apache/spark/pull/13155. It only fixes an issue with `Literal` construction and some style issues. All credit should go to frreiss.
# How was this patch tested?
Added regression tests to cover all branches of the updated rule (see changes to `SubquerySuite`).
Ran all existing automated regression tests after merging with the latest trunk.
Author: frreiss <frreiss@us.ibm.com>
Author: Herman van Hovell <hvanhovell@databricks.com>
Closes #13629 from hvanhovell/SPARK-15370-cleanup.