path: root/python/pyspark/sql
author: gatorsmile <gatorsmile@gmail.com> 2016-02-19 22:27:10 -0800
committer: Davies Liu <davies.liu@gmail.com> 2016-02-19 22:27:10 -0800
commit: ec7a1d6e425509f2472c3ae9497c7da796ce8129 (patch)
tree: 75671732b5e2ef5d0e2e0ab07a590353a89b8726 /python/pyspark/sql
parent: 983fa2d62029e7334fb661cb65c8cadaa4b86d1c (diff)
[SPARK-12594] [SQL] Outer Join Elimination by Filter Conditions
Convert outer joins when the predicates in filter conditions restrict the result set so that all null-supplying rows are eliminated:

- `full outer` -> `inner` if both sides have such predicates
- `left outer` -> `inner` if the right side has such predicates
- `right outer` -> `inner` if the left side has such predicates
- `full outer` -> `left outer` if only the left side has such predicates
- `full outer` -> `right outer` if only the right side has such predicates

Where applicable, this can greatly improve performance, since an outer join is much slower than an inner join, and a full outer join is much slower than a left/right outer join.

The original PR is https://github.com/apache/spark/pull/10542

Author: gatorsmile <gatorsmile@gmail.com>
Author: xiaoli <lixiao1983@gmail.com>
Author: Xiao Li <xiaoli@Xiaos-MacBook-Pro.local>

Closes #10567 from gatorsmile/outerJoinEliminationByFilterCond.
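To see why the rewrite is semantically safe, here is a minimal plain-Python sketch (not Spark code; the relations, column names, and helper functions are hypothetical). A filter that rejects nulls on the right side's columns discards exactly the rows that a left outer join adds beyond an inner join, so the two plans return identical results:

```python
def left_outer_join(left, right, key):
    """Left outer join on `key`; unmatched left rows get None for the
    right side's column (hardcoded as 'r_val' for this sketch)."""
    out = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            out.extend({**l, **r} for r in matches)
        else:
            out.append({**l, "r_val": None})  # null-supplying row
    return out

def inner_join(left, right, key):
    return [{**l, **r} for l in left for r in right if r[key] == l[key]]

left = [{"id": 1, "l_val": "a"}, {"id": 2, "l_val": "b"}]
right = [{"id": 1, "r_val": 10}]

# A null-rejecting predicate on the right side (r_val > 0) eliminates
# every null-supplying row that the outer join produced.
outer_then_filter = [
    row for row in left_outer_join(left, right, "id")
    if row["r_val"] is not None and row["r_val"] > 0
]
inner = inner_join(left, right, "id")

assert outer_then_filter == inner  # the outer join was unnecessary
```

The optimizer applies this reasoning in reverse: rather than computing the outer join and filtering, it detects the null-rejecting predicate at plan time and emits the cheaper inner join directly.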
Diffstat (limited to 'python/pyspark/sql')
0 files changed, 0 insertions, 0 deletions