author | Nattavut Sutyanyong <nsy.can@gmail.com> | 2016-11-22 12:06:21 -0800 |
---|---|---|
committer | Herman van Hovell <hvanhovell@databricks.com> | 2016-11-22 12:06:21 -0800 |
commit | 45ea46b7b397f023b4da878eb11e21b08d931115 | |
tree | 51be6bfe31812109263bac69f947ef315b5c084c /external/kafka-0-8/src/main | |
parent | bb152cdfbb8d02130c71d2326ae81939725c2cf0 | |
[SPARK-18504][SQL] Scalar subquery with extra group by columns returning incorrect result
## What changes were proposed in this pull request?
This PR blocks a scenario in which a scalar subquery returns an incorrect result: the subquery has
GROUP BY column(s) that are not part of the correlated predicate(s).
Example:

    // Incorrect result
    Seq(1).toDF("c1").createOrReplaceTempView("t1")
    Seq((1,1),(1,2)).toDF("c1","c2").createOrReplaceTempView("t2")
    sql("select (select sum(-1) from t2 where t1.c1=t2.c1 group by t2.c2) from t1").show
    // How can selecting a scalar subquery from a 1-row table return 2 rows?

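
To see why the extra GROUP BY column matters, the example can be modelled with plain Scala collections instead of Spark. This is only a sketch of the query's semantics, not of Spark's analyzer or of the check this PR adds; the object name `ScalarSubquerySketch` and the helpers `groupedByC2` and `groupedByC1` are hypothetical names introduced for illustration.

```scala
// Plain Scala collections standing in for the temp views t1 and t2.
// Assumption: this models query semantics only; no Spark API is involved.
object ScalarSubquerySketch {
  val t1: Seq[Int] = Seq(1)                      // column c1
  val t2: Seq[(Int, Int)] = Seq((1, 1), (1, 2))  // columns (c1, c2)

  // select sum(-1) from t2 where t1.c1 = t2.c1 group by t2.c2
  // c2 is not part of the correlated predicate, so each distinct c2
  // value produces its own group, and therefore its own result row.
  def groupedByC2(outerC1: Int): Seq[Int] =
    t2.filter { case (c1, _) => c1 == outerC1 }
      .groupBy { case (_, c2) => c2 }
      .toSeq
      .sortBy { case (key, _) => key }
      .map { case (_, rows) => rows.map(_ => -1).sum }

  // select sum(-1) from t2 where t1.c1 = t2.c1 group by t2.c1
  // The grouping column is fixed by the equality predicate, so at most
  // one group exists per outer row.
  def groupedByC1(outerC1: Int): Seq[Int] =
    t2.filter { case (c1, _) => c1 == outerC1 }
      .groupBy { case (c1, _) => c1 }
      .toSeq
      .map { case (_, rows) => rows.map(_ => -1).sum }

  def main(args: Array[String]): Unit =
    t1.foreach { c1 =>
      println(s"group by c2: ${groupedByC2(c1).size} row(s)")
      println(s"group by c1: ${groupedByC1(c1).size} row(s)")
    }
}
```

For the single row in `t1`, grouping by `c2` yields two values, which cannot fit in one scalar slot, whereas grouping by the correlated column `c1` yields exactly one.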
## How was this patch tested?
sql/test, catalyst/test
A new test case covering the reported problem was added to SubquerySuite.scala.
Author: Nattavut Sutyanyong <nsy.can@gmail.com>
Closes #15936 from nsyca/scalarSubqueryIncorrect-1.
Diffstat (limited to 'external/kafka-0-8/src/main')
0 files changed, 0 insertions, 0 deletions