author | gatorsmile <gatorsmile@gmail.com> | 2016-03-22 08:21:02 +0800 |
---|---|---|
committer | Wenchen Fan <wenchen@databricks.com> | 2016-03-22 08:21:02 +0800 |
commit | 3f49e0766f3a369a44e14632de68c657773b7a27 (patch) | |
tree | 98a493a351dc476656eff031d4f97109ffeed0e0 /examples | |
parent | b5f1ab701a167a728bb006e01b392b203da84391 (diff) | |
[SPARK-13320][SQL] Support Star in CreateStruct/CreateArray and Error Handling when DataFrame/DataSet Functions using Star
This PR resolves two issues:
First, it expands * inside structs passed to aggregate functions when using the DataFrame/Dataset APIs. For example:
```scala
structDf.groupBy($"a").agg(min(struct($"record.*")))
```
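For context, here is a minimal, self-contained sketch of the call above. It is not taken from the PR: the sample rows, the field names `x` and `y`, the alias `minRecord`, and the object name `StructStarSketch` are hypothetical, and it assumes a Spark 2.x or later build where `SparkSession` is available.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{min, struct}

object StructStarSketch {
  // Builds a small DataFrame with a nested "record" column (hypothetical data),
  // then takes the per-group minimum of the struct rebuilt from "record.*".
  def run(spark: SparkSession): DataFrame = {
    import spark.implicits._

    val structDf = Seq((1, 10, 100), (1, 5, 200), (2, 7, 300))
      .toDF("a", "x", "y")
      .select($"a", struct($"x", $"y").as("record"))

    // $"record.*" expands to every field of "record" inside struct(...).
    structDf.groupBy($"a").agg(min(struct($"record.*")).as("minRecord"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("struct-star-sketch")
      .getOrCreate()
    try run(spark).show() finally spark.stop()
  }
}
```

The struct is compared field by field, so the aggregate returns one complete record per group rather than independent per-column minimums.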
Second, it improves the error message reported for invalid uses of star in the DataFrame/Dataset APIs. For example:
```scala
pagecounts4PartitionsDS
  .map(line => (line._1, line._3))
  .toDF()
  .groupBy($"_1")
  .agg(sum("*") as "sumOccurances")
```
Before the fix, this invalid usage produced a confusing error message such as:
```
org.apache.spark.sql.AnalysisException: cannot resolve '_1' given input columns _1, _2;
```
After the fix, the message reads:
```
org.apache.spark.sql.AnalysisException: Invalid usage of '*' in function 'sum'
```
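As a rough illustration of how a caller might observe this error, the sketch below wraps the invalid aggregation in `Try` and returns the exception message. The sample data and the names `InvalidStarSketch` and `analysisError` are hypothetical. It assumes a Spark 2.x or later build with a classic (non-Connect) session, where analysis runs eagerly when `agg` is called. The exact message wording may differ between Spark versions, so the code does not match on it.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}
import org.apache.spark.sql.functions.sum
import scala.util.{Failure, Success, Try}

object InvalidStarSketch {
  // Returns the analysis error message if sum("*") is rejected, None otherwise.
  def analysisError(spark: SparkSession): Option[String] = {
    import spark.implicits._

    // Hypothetical stand-in for the (_1, _2) pairs in the example above.
    val df = Seq(("en", 3L), ("en", 4L), ("fr", 1L)).toDF("_1", "_2")

    Try(df.groupBy($"_1").agg(sum("*") as "sumOccurances")) match {
      case Failure(e: AnalysisException) => Some(e.getMessage)
      case Failure(other)                => throw other
      case Success(_)                    => None
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("invalid-star-sketch")
      .getOrCreate()
    try println(analysisError(spark)) finally spark.stop()
  }
}
```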
cc: rxin nongli cloud-fan
Author: gatorsmile <gatorsmile@gmail.com>
Closes #11208 from gatorsmile/sumDataSetResolution.
Diffstat (limited to 'examples')
0 files changed, 0 insertions, 0 deletions