aboutsummaryrefslogtreecommitdiff
path: root/python/pyspark/sql
diff options
context:
space:
mode:
authorgatorsmile <gatorsmile@gmail.com>2016-03-21 10:34:54 +0800
committerWenchen Fan <wenchen@databricks.com>2016-03-21 10:34:54 +0800
commitf58319a24fd5e026411538b1fb7336d9d894277b (patch)
treec109f1f3e23cacf6f0cb4f0d2c78088f5514f531 /python/pyspark/sql
parent454a00df2a43176cb774cad7277934a775618db1 (diff)
downloadspark-f58319a24fd5e026411538b1fb7336d9d894277b.tar.gz
spark-f58319a24fd5e026411538b1fb7336d9d894277b.tar.bz2
spark-f58319a24fd5e026411538b1fb7336d9d894277b.zip
[SPARK-14019][SQL] Remove noop SortOrder in Sort
#### What changes were proposed in this pull request? This PR is to add a new Optimizer rule for pruning Sort if its SortOrder is no-op. In the phase of **Optimizer**, if a specific `SortOrder` does not have any reference, it has no effect on the sorting results. If `Sort` is empty, remove the whole `Sort`. For example, in the following SQL query ```SQL SELECT * FROM t ORDER BY NULL + 5 ``` Before the fix, the plan is like ``` == Analyzed Logical Plan == a: int, b: int Sort [(cast(null as int) + 5) ASC], true +- Project [a#92,b#93] +- SubqueryAlias t +- Project [_1#89 AS a#92,_2#90 AS b#93] +- LocalRelation [_1#89,_2#90], [[1,2],[1,2]] == Optimized Logical Plan == Sort [null ASC], true +- LocalRelation [a#92,b#93], [[1,2],[1,2]] == Physical Plan == WholeStageCodegen : +- Sort [null ASC], true, 0 : +- INPUT +- Exchange rangepartitioning(null ASC, 5), None +- LocalTableScan [a#92,b#93], [[1,2],[1,2]] ``` After the fix, the plan is like ``` == Analyzed Logical Plan == a: int, b: int Sort [(cast(null as int) + 5) ASC], true +- Project [a#92,b#93] +- SubqueryAlias t +- Project [_1#89 AS a#92,_2#90 AS b#93] +- LocalRelation [_1#89,_2#90], [[1,2],[1,2]] == Optimized Logical Plan == LocalRelation [a#92,b#93], [[1,2],[1,2]] == Physical Plan == LocalTableScan [a#92,b#93], [[1,2],[1,2]] ``` cc rxin cloud-fan marmbrus Thanks! #### How was this patch tested? Added a test suite for covering this rule Author: gatorsmile <gatorsmile@gmail.com> Closes #11840 from gatorsmile/sortElimination.
Diffstat (limited to 'python/pyspark/sql')
0 files changed, 0 insertions, 0 deletions