author    Nattavut Sutyanyong <nsy.can@gmail.com>  2017-03-14 10:37:10 +0100
committer Herman van Hovell <hvanhovell@databricks.com>  2017-03-14 10:37:10 +0100
commit    4ce970d71488c7de6025ef925f75b8b92a5a6a79 (patch)
tree      2857e3a5c373359042796ca662769b786bc83fbf /sql/core/src/test/resources/sql-tests/results
parent    f6314eab4b494bd5b5e9e41c6f582d4f22c0967a (diff)
[SPARK-18874][SQL] First phase: Deferring the correlated predicate pull up to Optimizer phase
## What changes were proposed in this pull request?

Currently the Analyzer, as part of ResolveSubquery, pulls correlated predicates up to their originating SubqueryExpression. The subquery plan is then transformed to remove those correlated predicates once they have been moved up to the outer plan. In this PR, the task of pulling up correlated predicates is deferred to the Optimizer. This is the initial work that will allow us to support forms of correlated subqueries that we don't support today.

The design document from nsyca can be found at the following link: [DesignDoc](https://docs.google.com/document/d/1QDZ8JwU63RwGFS6KVF54Rjj9ZJyK33d49ZWbjFBaIgU/edit#)

A brief description of the code changes (hopefully to aid with code review) can be found at the following link: [CodeChanges](https://docs.google.com/document/d/18mqjhL9V1An-tNta7aVE13HkALRZ5GZ24AATA-Vqqf0/edit#)

## How was this patch tested?

The test case PRs were submitted earlier:
[16337](https://github.com/apache/spark/pull/16337) [16759](https://github.com/apache/spark/pull/16759) [16841](https://github.com/apache/spark/pull/16841) [16915](https://github.com/apache/spark/pull/16915) [16798](https://github.com/apache/spark/pull/16798) [16712](https://github.com/apache/spark/pull/16712) [16710](https://github.com/apache/spark/pull/16710) [16760](https://github.com/apache/spark/pull/16760) [16802](https://github.com/apache/spark/pull/16802)

Author: Dilip Biswal <dbiswal@us.ibm.com>

Closes #16954 from dilipbiswal/SPARK-18874.
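For context, a minimal sketch of the kind of correlated subquery this change concerns (the tables `t1` and `t2` and their columns are hypothetical, not taken from this patch): the inner predicate references the outer relation, and it is this kind of predicate whose pull-up now happens during optimization rather than analysis.

```sql
-- Hypothetical schema: t1(t1a, t1b), t2(t2a, t2b).
-- The inner predicate t2b = t1b is correlated: it references the outer
-- relation t1, so it must eventually be pulled up into the outer plan
-- (e.g. as a join condition) before the subquery can be executed.
SELECT t1a
FROM   t1
WHERE  t1a IN (SELECT t2a
               FROM   t2
               WHERE  t2b = t1b);
```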
Diffstat (limited to 'sql/core/src/test/resources/sql-tests/results')
-rw-r--r--  sql/core/src/test/resources/sql-tests/results/subquery/negative-cases/invalid-correlation.sql.out  4
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sql/core/src/test/resources/sql-tests/results/subquery/negative-cases/invalid-correlation.sql.out b/sql/core/src/test/resources/sql-tests/results/subquery/negative-cases/invalid-correlation.sql.out
index 50ae01e181..f7bbb35aad 100644
--- a/sql/core/src/test/resources/sql-tests/results/subquery/negative-cases/invalid-correlation.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/subquery/negative-cases/invalid-correlation.sql.out
@@ -46,7 +46,7 @@ and t2b = (select max(avg)
struct<>
-- !query 3 output
org.apache.spark.sql.AnalysisException
-expression 't2.`t2b`' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;
+grouping expressions sequence is empty, and 't2.`t2b`' is not an aggregate function. Wrap '(avg(CAST(t2.`t2b` AS BIGINT)) AS `avg`)' in windowing function(s) or wrap 't2.`t2b`' in first() (or first_value) if you don't care which value you get.;
-- !query 4
@@ -63,4 +63,4 @@ where t1a in (select min(t2a)
struct<>
-- !query 4 output
org.apache.spark.sql.AnalysisException
-resolved attribute(s) t2b#x missing from min(t2a)#x,t2c#x in operator !Filter predicate-subquery#x [(t2c#x = max(t3c)#x) && (t3b#x > t2b#x)];
+resolved attribute(s) t2b#x missing from min(t2a)#x,t2c#x in operator !Filter t2c#x IN (list#x [t2b#x]);
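For reference, the first updated message above is Spark's Analyzer error for referencing a non-aggregated column while the grouping list is empty. A minimal sketch of a query that triggers that class of error (the table and column names are hypothetical and independent of the test file above):

```sql
-- Hypothetical table t2(t2a, t2b). Selecting the bare column t2b next to
-- an aggregate with no GROUP BY clause leaves the grouping expression
-- sequence empty, producing an AnalysisException like the one above.
SELECT t2b, avg(t2b) AS avg
FROM   t2;
```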