author | wangfei <wangfei1@huawei.com> | 2015-01-29 15:47:13 -0800 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2015-01-29 15:47:18 -0800 |
commit | c1b3eebf97b986439f71afd3c4eccf47b90da2cd (patch) | |
tree | 67b08fa72d48327bf27146d2383d24be4675e1c3 /sql/catalyst | |
parent | fbaf9e08961551d3ae5c3629eca01e839b001b8e (diff) | |
[SPARK-5373][SQL] Literal in agg grouping expressions leads to incorrect result
`select key, count(*) from src group by key, 1` returns the wrong answer.
For example, given this table:
```
val testData2 =
  TestSQLContext.sparkContext.parallelize(
    TestData2(1, 1) ::
    TestData2(1, 2) ::
    TestData2(2, 1) ::
    TestData2(2, 2) ::
    TestData2(3, 1) ::
    TestData2(3, 2) :: Nil, 2).toSchemaRDD
testData2.registerTempTable("testData2")
```
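the query `SELECT a, count(1) FROM testData2 GROUP BY a, 1` returns the following, even though each value of `a` occurs twice and every count should therefore be 2: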
```
[1,1]
[2,2]
[3,1]
```
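For comparison, here is a minimal sketch of the expected answer using plain Scala collections rather than Spark. The `TestData2` case class below is a local stand-in whose field names `a` and `b` are assumed, and `GroupByLiteralDemo` is a hypothetical name used only for illustration. It shows that a constant in the grouping key cannot split any group, so grouping by `(a, 1)` should give the same counts as grouping by `a` alone.

```scala
// Local stand-in for the test fixture; the field names a and b are assumed.
case class TestData2(a: Int, b: Int)

object GroupByLiteralDemo {
  val rows: List[TestData2] = List(
    TestData2(1, 1), TestData2(1, 2),
    TestData2(2, 1), TestData2(2, 2),
    TestData2(3, 1), TestData2(3, 2))

  // Group by (a, 1): the constant 1 is identical for every row,
  // so the groups are the same as when grouping by a alone.
  val counts: List[(Int, Int)] =
    rows.groupBy(r => (r.a, 1))
      .map { case ((a, _), group) => (a, group.size) }
      .toList
      .sortBy(_._1)

  def main(args: Array[String]): Unit =
    counts.foreach { case (a, n) => println(s"[$a,$n]") }
}
```

Every group here has exactly two rows, which is the per-key count the SQL query should report.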
Author: wangfei <wangfei1@huawei.com>
Closes #4169 from scwf/agg-bug and squashes the following commits:
05751db [wangfei] fix bugs when literal in agg grouping expressions
Diffstat (limited to 'sql/catalyst')
-rw-r--r-- | sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala | 9 |
1 files changed, 5 insertions, 4 deletions
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
index 310d127506..b4c445b3ba 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
@@ -141,10 +141,11 @@ object PartialAggregation {
         // We need to pass all grouping expressions though so the grouping can happen a second
         // time. However some of them might be unnamed so we alias them allowing them to be
         // referenced in the second aggregation.
-        val namedGroupingExpressions: Map[Expression, NamedExpression] = groupingExpressions.map {
-          case n: NamedExpression => (n, n)
-          case other => (other, Alias(other, "PartialGroup")())
-        }.toMap
+        val namedGroupingExpressions: Map[Expression, NamedExpression] =
+          groupingExpressions.filter(!_.isInstanceOf[Literal]).map {
+            case n: NamedExpression => (n, n)
+            case other => (other, Alias(other, "PartialGroup")())
+          }.toMap
 
         // Replace aggregations with a new expression that computes the result from the already
         // computed partial evaluations and grouping values.
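The filtering step in this change can be tried in isolation with a toy model. The sketch below is not Catalyst code: `Expression`, `NamedExpression`, `AttributeReference`, `Alias`, `Literal`, and `Add` are simplified stand-ins defined locally (the real `Alias` takes a second parameter list, which is omitted here), and `NamedGroupingSketch` is a hypothetical name. It only illustrates that literals are dropped before the remaining grouping expressions are named.

```scala
// Simplified stand-ins for Catalyst expression types (hypothetical, not the real classes).
sealed trait Expression
sealed trait NamedExpression extends Expression { def name: String }
case class AttributeReference(name: String) extends NamedExpression
case class Alias(child: Expression, name: String) extends NamedExpression
case class Literal(value: Any) extends Expression
case class Add(left: Expression, right: Expression) extends Expression

object NamedGroupingSketch {
  // Same shape as the patched code: literals are filtered out first, then
  // named expressions map to themselves and unnamed ones get an alias.
  def namedGroupingExpressions(
      groupingExpressions: Seq[Expression]): Map[Expression, NamedExpression] =
    groupingExpressions.filter(!_.isInstanceOf[Literal]).map {
      case n: NamedExpression => (n, n)
      case other => (other, Alias(other, "PartialGroup"))
    }.toMap

  def main(args: Array[String]): Unit = {
    val a = AttributeReference("a")
    // Models GROUP BY a, 1, a + 1: only a and a + 1 should survive.
    val grouping = Seq(a, Literal(1), Add(a, Literal(1)))
    namedGroupingExpressions(grouping).foreach { case (k, v) => println(s"$k -> $v") }
  }
}
```

Note that the filter only removes top-level literals: an expression such as `a + 1` that merely contains a literal is kept and aliased.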