author | gatorsmile <gatorsmile@gmail.com> | 2016-02-11 08:40:27 +0100 |
---|---|---|
committer | Herman van Hovell <hvanhovell@questtec.nl> | 2016-02-11 08:40:27 +0100 |
commit | e88bff12795a6134e2e7204996b603e948380e18 (patch) | |
tree | d4c5b19801ebfce2e08b5b4bfee08b679f3f3f6a /sql/README.md | |
parent | 1842c55d89ae99a610a955ce61633a9084e000f2 (diff) | |
[SPARK-13235][SQL] Removed an Extra Distinct from the Plan when Using Union in SQL
Currently, the parser adds two `Distinct` operators to the plan when a query uses `UNION` or `UNION DISTINCT`. This PR removes the extra `Distinct` from the plan.
For example, before the fix, the following query produces a plan with two `Distinct` operators:
```scala
sql("select * from t0 union select * from t0").explain(true)
```
```
== Parsed Logical Plan ==
'Project [unresolvedalias(*,None)]
+- 'Subquery u_2
   +- 'Distinct
      +- 'Project [unresolvedalias(*,None)]
         +- 'Subquery u_1
            +- 'Distinct
               +- 'Union
                  :- 'Project [unresolvedalias(*,None)]
                  :  +- 'UnresolvedRelation `t0`, None
                  +- 'Project [unresolvedalias(*,None)]
                     +- 'UnresolvedRelation `t0`, None
== Analyzed Logical Plan ==
id: bigint
Project [id#16L]
+- Subquery u_2
   +- Distinct
      +- Project [id#16L]
         +- Subquery u_1
            +- Distinct
               +- Union
                  :- Project [id#16L]
                  :  +- Subquery t0
                  :     +- Relation[id#16L] ParquetRelation
                  +- Project [id#16L]
                     +- Subquery t0
                        +- Relation[id#16L] ParquetRelation
== Optimized Logical Plan ==
Aggregate [id#16L], [id#16L]
+- Aggregate [id#16L], [id#16L]
   +- Union
      :- Project [id#16L]
      :  +- Relation[id#16L] ParquetRelation
      +- Project [id#16L]
         +- Relation[id#16L] ParquetRelation
```
After the fix, the plan contains only a single `Distinct`:
```
== Parsed Logical Plan ==
'Project [unresolvedalias(*,None)]
+- 'Subquery u_1
   +- 'Distinct
      +- 'Union
         :- 'Project [unresolvedalias(*,None)]
         :  +- 'UnresolvedRelation `t0`, None
         +- 'Project [unresolvedalias(*,None)]
            +- 'UnresolvedRelation `t0`, None
== Analyzed Logical Plan ==
id: bigint
Project [id#17L]
+- Subquery u_1
   +- Distinct
      +- Union
         :- Project [id#16L]
         :  +- Subquery t0
         :     +- Relation[id#16L] ParquetRelation
         +- Project [id#16L]
            +- Subquery t0
               +- Relation[id#16L] ParquetRelation
== Optimized Logical Plan ==
Aggregate [id#17L], [id#17L]
+- Union
   :- Project [id#16L]
   :  +- Relation[id#16L] ParquetRelation
   +- Project [id#16L]
      +- Relation[id#16L] ParquetRelation
```
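The extra `Distinct` is redundant because deduplicating already-deduplicated rows changes nothing; removing it only saves work (one `Aggregate` instead of two in the optimized plan). The following is a minimal toy model of that argument, not Spark code: `Plan`, `Relation`, `Union`, `Distinct`, `eval`, and `countDistinct` are hypothetical names invented for illustration, and the intermediate `Project` and `Subquery` nodes from the real plans are left out.

```scala
// Toy model only: these case classes are NOT Spark's Catalyst operators.
object UnionDistinctSketch {
  sealed trait Plan
  final case class Relation(rows: Seq[Long]) extends Plan
  final case class Union(left: Plan, right: Plan) extends Plan
  final case class Distinct(child: Plan) extends Plan

  // Evaluate a plan to rows. Union keeps duplicates, like UNION ALL.
  def eval(plan: Plan): Seq[Long] = plan match {
    case Relation(rows) => rows
    case Union(l, r)    => eval(l) ++ eval(r)
    case Distinct(c)    => eval(c).distinct
  }

  // Count the Distinct operators in a plan.
  def countDistinct(plan: Plan): Int = plan match {
    case Relation(_) => 0
    case Union(l, r) => countDistinct(l) + countDistinct(r)
    case Distinct(c) => 1 + countDistinct(c)
  }

  // Made-up sample rows standing in for table t0.
  val t0: Plan = Relation(Seq(1L, 2L, 2L, 3L))

  // Plan shape before the fix: two stacked Distinct operators.
  val before: Plan = Distinct(Distinct(Union(t0, t0)))

  // Plan shape after the fix: a single Distinct over the Union.
  val after: Plan = Distinct(Union(t0, t0))

  def main(args: Array[String]): Unit = {
    println(s"before: ${countDistinct(before)} Distinct, rows = ${eval(before)}")
    println(s"after:  ${countDistinct(after)} Distinct, rows = ${eval(after)}")
    // Both plans return the same rows; only the operator count differs.
    assert(eval(before) == eval(after))
  }
}
```

In this sketch both plans evaluate to the same rows, which mirrors why the change affects the plan shape but not the query result.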
Author: gatorsmile <gatorsmile@gmail.com>
Closes #11120 from gatorsmile/unionDistinct.