[SPARK-13221] [SQL] Fixing GroupingSets when Aggregate Functions Containing GroupBy Columns - spark

author	gatorsmile <gatorsmile@gmail.com>	2016-02-15 23:16:58 -0800
committer	Davies Liu <davies.liu@gmail.com>	2016-02-15 23:16:58 -0800
commit	fee739f07b3bc37dd65682e93e60e0add848f583 (patch)
tree	f628025468812328819b3fe3a00b21390f00780e /external
parent	e4675c240255207c5dd812fa657e6aca2dc9cfeb (diff)

[SPARK-13221] [SQL] Fixing GroupingSets when Aggregate Functions Containing GroupBy Columns

Using GroupingSets generates a wrong result when aggregate functions contain GroupBy columns. This PR fixes that. Since the code changes are very small, maybe we can also merge it into 1.6.

For example, the following query returns a wrong result:

```scala
sql("select course, sum(earnings) as sum from courseSales group by course, earnings" +
  " grouping sets((), (course), (course, earnings))" +
  " order by course, sum").show()
```

Before the fix, the results are:

```
[null,null]
[Java,null]
[Java,20000.0]
[Java,30000.0]
[dotNET,null]
[dotNET,5000.0]
[dotNET,10000.0]
[dotNET,48000.0]
```

After the fix, the results are correct:

```
[null,113000.0]
[Java,20000.0]
[Java,30000.0]
[Java,50000.0]
[dotNET,5000.0]
[dotNET,10000.0]
[dotNET,48000.0]
[dotNET,63000.0]
```

UPDATE: This PR also deprecated the external column: GROUPING__ID.

Author: gatorsmile <gatorsmile@gmail.com>

Closes #11100 from gatorsmile/groupingSets.
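
To make the expected semantics concrete, here is a minimal plain-Scala sketch (no Spark involved) that computes what the three grouping sets should return. It is an illustration only, not the code changed by this PR. The `courseSales` rows are an assumption inferred from the (course, earnings) rows in the corrected output, with each earnings value assumed to occur once; `GroupingSetsSketch`, `Sale`, `sumBy`, and `expected` are hypothetical names.

```scala
// Plain-Scala model of the GROUPING SETS semantics in the example query.
// ASSUMPTION: these rows are inferred from the corrected output above;
// the real courseSales table is not shown in this commit message.
object GroupingSetsSketch {
  final case class Sale(course: String, earnings: Double)

  val courseSales: Seq[Sale] = Seq(
    Sale("Java", 20000.0), Sale("Java", 30000.0),
    Sale("dotNET", 5000.0), Sale("dotNET", 10000.0), Sale("dotNET", 48000.0)
  )

  // One output row: the course (None models NULL) and sum(earnings).
  type Row = (Option[String], Double)

  // Group the rows by an arbitrary key and sum earnings within each group.
  def sumBy[K](key: Sale => K): Seq[(K, Double)] =
    courseSales.groupBy(key).toSeq.map { case (k, rows) => (k, rows.map(_.earnings).sum) }

  def expected: Seq[Row] = {
    // Grouping set (): a single grand-total row.
    val grandTotal: Seq[Row] = Seq((None, courseSales.map(_.earnings).sum))
    // Grouping set (course): one row per course.
    val byCourse: Seq[Row] = sumBy(_.course).map { case (c, s) => (Some(c), s) }
    // Grouping set (course, earnings): the aggregated column is also a key here,
    // but sum(earnings) must still be computed from the original values.
    val byCourseAndEarnings: Seq[Row] =
      sumBy(s => (s.course, s.earnings)).map { case ((c, _), s) => (Some(c), s) }

    // ORDER BY course, sum, with the NULL course sorted first.
    (grandTotal ++ byCourse ++ byCourseAndEarnings).sortWith { (a, b) =>
      val byName = a._1.getOrElse("").compareTo(b._1.getOrElse(""))
      if (byName != 0) byName < 0 else a._2 < b._2
    }
  }

  def main(args: Array[String]): Unit =
    expected.foreach { case (c, s) => println(s"[${c.getOrElse("null")},$s]") }
}
```

Under these assumed rows, the sketch yields the same eight rows as the corrected output. The per-course and grand-total sums are present because `sum(earnings)` is always computed from the original `earnings` values, even in the grouping sets where `earnings` is not a grouping key.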

Diffstat (limited to 'external')

0 files changed, 0 insertions, 0 deletions

