author     Sean Zhong <seanzhong@databricks.com>    2016-08-16 15:51:30 +0800
committer  Wenchen Fan <wenchen@databricks.com>     2016-08-16 15:51:30 +0800
commit     7b65030e7a0af3a0bd09370fb069d659b36ff7f0 (patch)
tree       c820a00facee5871059e8412c23125823944b838 /sql/core/src/test/resources/sql-tests
parent     7de30d6e9e5d3020d2ba8c2ce08893d9cd822b56 (diff)
[SPARK-17034][SQL] adds expression UnresolvedOrdinal to represent the ordinals in GROUP BY or ORDER BY
## What changes were proposed in this pull request?

This PR adds expression `UnresolvedOrdinal` to represent the ordinal in GROUP BY or ORDER BY, and fixes the rules for resolving ordinals.

Ordinals in GROUP BY or ORDER BY, like `1` in `order by 1` or `group by 1`, should be considered unresolved before analysis. But the current code uses a `Literal` expression to store the ordinal. This is inappropriate, as `Literal` itself is a resolved expression: it gives the user the wrong message that the ordinals have already been resolved.

### Before this change

The ordinal is stored as a `Literal` expression:

```
scala> sc.setLogLevel("TRACE")
scala> sql("select a from t group by 1 order by 1")
...
'Sort [1 ASC], true
+- 'Aggregate [1], ['a]
   +- 'UnresolvedRelation `t`
```

For the query:

```
scala> Seq(1).toDF("a").createOrReplaceTempView("t")
scala> sql("select count(a), a from t group by 2 having a > 0").show
```

the intermediate plan during analysis, before applying rule `ResolveAggregateFunctions`, is:

```
'Filter ('a > 0)
+- Aggregate [2], [count(1) AS count(1)#83L, a#81]
   +- LocalRelation [value#7 AS a#9]
```

Before this PR, rule `ResolveAggregateFunctions` believes all expressions of the `Aggregate` have already been resolved, and tries to resolve the expressions in `Filter` directly. But this is wrong, as ordinal `2` in the `Aggregate` is not really resolved!

### After this change

Ordinals are stored as `UnresolvedOrdinal`:

```
scala> sc.setLogLevel("TRACE")
scala> sql("select a from t group by 1 order by 1")
...
'Sort [unresolvedordinal(1) ASC], true
+- 'Aggregate [unresolvedordinal(1)], ['a]
   +- 'UnresolvedRelation `t`
```

## How was this patch tested?

Unit tests.

Author: Sean Zhong <seanzhong@databricks.com>

Closes #14616 from clockfly/spark-16955.
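For readers skimming the commit, here is a minimal, self-contained sketch of the idea in plain Scala, with no Spark dependency. All names below (`Expression`, `Literal`, `UnresolvedOrdinal`, `substituteOrdinals`, `resolveOrdinals`) are simplified stand-ins for the Catalyst classes the PR touches, not Spark's actual API; see the PR itself for the real definitions.

```scala
// A minimal sketch of the technique: represent a GROUP BY / ORDER BY ordinal
// as an expression that reports itself as UNresolved, instead of reusing the
// already-resolved Literal. Names are illustrative, not Spark's real API.
object UnresolvedOrdinalSketch {

  sealed trait Expression {
    def resolved: Boolean
  }

  // A resolved constant, analogous to Catalyst's Literal.
  case class Literal(value: Any) extends Expression {
    val resolved = true
  }

  // An ordinal reference such as the `1` in `group by 1`: deliberately
  // unresolved, so analyzer rules that require resolved input skip the plan
  // until the ordinal has been rewritten into the column it points at.
  case class UnresolvedOrdinal(ordinal: Int) extends Expression {
    val resolved = false
  }

  // Substitution step: integer literals sitting directly in a grouping or
  // ordering position are re-tagged as UnresolvedOrdinal.
  def substituteOrdinals(exprs: Seq[Expression]): Seq[Expression] =
    exprs.map {
      case Literal(i: Int) => UnresolvedOrdinal(i)
      case other           => other
    }

  // Resolution step: each ordinal is mapped onto the select list (1-based).
  def resolveOrdinals(exprs: Seq[Expression],
                      selectList: Seq[Expression]): Seq[Expression] =
    exprs.map {
      case UnresolvedOrdinal(i) if i >= 1 && i <= selectList.size =>
        selectList(i - 1)
      case other => other
    }

  def main(args: Array[String]): Unit = {
    // `group by 1` initially parses to Literal(1) ...
    val grouping = substituteOrdinals(Seq(Literal(1)))
    println(grouping)                    // List(UnresolvedOrdinal(1))
    println(grouping.forall(_.resolved)) // false: the plan stays unresolved
    // ... and only becomes resolved once mapped onto the select list.
    println(resolveOrdinals(grouping, Seq(Literal("a")))) // List(Literal(a))
  }
}
```

The design point this sketch illustrates: because `UnresolvedOrdinal(2).resolved` is `false`, a plan node containing it also reports itself as unresolved, which is exactly the signal a rule like `ResolveAggregateFunctions` needs in order to wait until the ordinal has been mapped to a real column.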
Diffstat (limited to 'sql/core/src/test/resources/sql-tests')
-rw-r--r--  sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql        6
-rw-r--r--  sql/core/src/test/resources/sql-tests/results/group-by-ordinal.sql.out  28
2 files changed, 28 insertions(+), 6 deletions(-)
diff --git a/sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql b/sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql
index 36b469c617..9c8d851e36 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql
@@ -43,6 +43,12 @@ select a, rand(0), sum(b) from data group by a, 2;
-- negative case: star
select * from data group by a, b, 1;
+-- group by ordinal followed by order by
+select a, count(a) from (select 1 as a) tmp group by 1 order by 1;
+
+-- group by ordinal followed by having
+select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
+
-- turn off group by ordinal
set spark.sql.groupByOrdinal=false;
diff --git a/sql/core/src/test/resources/sql-tests/results/group-by-ordinal.sql.out b/sql/core/src/test/resources/sql-tests/results/group-by-ordinal.sql.out
index 2f10b7ebc6..9c3a145f3a 100644
--- a/sql/core/src/test/resources/sql-tests/results/group-by-ordinal.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/group-by-ordinal.sql.out
@@ -1,5 +1,5 @@
-- Automatically generated by SQLQueryTestSuite
--- Number of queries: 17
+-- Number of queries: 19
-- !query 0
@@ -153,16 +153,32 @@ Star (*) is not allowed in select list when GROUP BY ordinal position is used;
-- !query 15
-set spark.sql.groupByOrdinal=false
+select a, count(a) from (select 1 as a) tmp group by 1 order by 1
-- !query 15 schema
-struct<key:string,value:string>
+struct<a:int,count(a):bigint>
-- !query 15 output
-spark.sql.groupByOrdinal
+1 1
-- !query 16
-select sum(b) from data group by -1
+select count(a), a from (select 1 as a) tmp group by 2 having a > 0
-- !query 16 schema
-struct<sum(b):bigint>
+struct<count(a):bigint,a:int>
-- !query 16 output
+1 1
+
+
+-- !query 17
+set spark.sql.groupByOrdinal=false
+-- !query 17 schema
+struct<key:string,value:string>
+-- !query 17 output
+spark.sql.groupByOrdinal
+
+
+-- !query 18
+select sum(b) from data group by -1
+-- !query 18 schema
+struct<sum(b):bigint>
+-- !query 18 output
9