author | Cheng Lian <lian@databricks.com> | 2014-11-14 15:09:36 -0800 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2014-11-14 15:09:36 -0800 |
commit | 0c7b66bd449093bb5d2dafaf91d54e63e601e320 (patch) | |
tree | 598c2985d9281a75fccbfd55e8ca06cd910955c7 /docs/java-programming-guide.md | |
parent | 4b4b50c9e596673c1534df97effad50d107a8007 (diff) | |
[SPARK-4322][SQL] Enables struct fields as sub expressions of grouping fields
While resolving struct fields, the resulting `GetField` expression is wrapped in an `Alias` to make it a named expression. Assume `a` is a struct instance with a field `b`; then `"a.b"` is resolved as `Alias(GetField(a, "b"), "b")`. Thus, for the following SQL query:
```sql
SELECT a.b + 1 FROM t GROUP BY a.b + 1
```
the grouping expression is
```scala
Add(GetField(a, "b"), Literal(1, IntegerType))
```
while the aggregation expression is
```scala
Add(Alias(GetField(a, "b"), "b"), Literal(1, IntegerType))
```
This mismatch makes the above SQL query fail during both the analysis and execution phases. This PR fixes the issue by removing the alias when substituting aggregation expressions.
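The mismatch and the effect of trimming can be sketched on a toy expression tree. This is only an illustration, not the code in this PR: the case classes below merely borrow the names used above and are not Catalyst's classes, `Literal` is simplified to carry a bare `Int`, and `trimFieldAliases` is a hypothetical helper name.

```scala
// Toy expression tree. These case classes only mimic the names used in the
// commit message; they are NOT Spark's Catalyst classes.
sealed trait Expr
case class Attr(name: String) extends Expr
case class Literal(value: Int) extends Expr // simplified: no data type argument
case class GetField(child: Expr, field: String) extends Expr
case class Alias(child: Expr, name: String) extends Expr
case class Add(left: Expr, right: Expr) extends Expr

object AliasTrimming {
  // Hypothetical helper: drop an Alias only when it directly wraps a GetField,
  // and recurse through the rest of the tree. Other aliases are kept.
  def trimFieldAliases(e: Expr): Expr = e match {
    case Alias(g: GetField, _) => trimFieldAliases(g)
    case Alias(c, n)           => Alias(trimFieldAliases(c), n)
    case GetField(c, f)        => GetField(trimFieldAliases(c), f)
    case Add(l, r)             => Add(trimFieldAliases(l), trimFieldAliases(r))
    case other                 => other
  }

  // Grouping expression for `GROUP BY a.b + 1`
  val grouping: Expr = Add(GetField(Attr("a"), "b"), Literal(1))

  // Aggregation expression for `SELECT a.b + 1`, with the alias still in place
  val aggregation: Expr = Add(Alias(GetField(Attr("a"), "b"), "b"), Literal(1))
}
```

In this sketch the two trees compare unequal as written, and `AliasTrimming.trimFieldAliases(AliasTrimming.aggregation)` is structurally equal to `AliasTrimming.grouping`, which is the property the substitution step relies on.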
Author: Cheng Lian <lian@databricks.com>
Closes #3248 from liancheng/spark-4322 and squashes the following commits:
23a46ea [Cheng Lian] Code simplification
dd20a79 [Cheng Lian] Should only trim aliases around `GetField`s
7f46532 [Cheng Lian] Enables struct fields as sub expressions of grouping fields