diff options
author | Cheng Lian <lian@databricks.com> | 2016-03-22 19:20:56 +0800 |
---|---|---|
committer | Cheng Lian <lian@databricks.com> | 2016-03-22 19:20:56 +0800 |
commit | f2e855fba8eb73475cf312cdf880c1297d4323bb (patch) | |
tree | 2573c5f19c05979757deb990abbbb10dc6c2be6c /sql/catalyst/src | |
parent | 14464cadb9477be8b7f4c891ea990535ab6638ec (diff) | |
download | spark-f2e855fba8eb73475cf312cdf880c1297d4323bb.tar.gz spark-f2e855fba8eb73475cf312cdf880c1297d4323bb.tar.bz2 spark-f2e855fba8eb73475cf312cdf880c1297d4323bb.zip |
[SPARK-13473][SQL] Simplifies PushPredicateThroughProject
## What changes were proposed in this pull request?
This is a follow-up of PR #11348.
After PR #11348, a predicate is never pushed through a project as long as the project contains any non-deterministic fields. Thus, it's impossible that the candidate filter condition can reference any non-deterministic projected fields, and related logic can be safely cleaned up.
To be more specific, the following optimization is allowed:
```scala
// From:
df.select('a, 'b).filter('c > rand(42))
// To:
df.filter('c > rand(42)).select('a, 'b)
```
while this isn't:
```scala
// From:
df.select('a, rand('b) as 'rb, 'c).filter('c > 'rb)
// To:
df.filter('c > rand('b)).select('a, rand('b) as 'rb, 'c)
```
## How was this patch tested?
Existing test cases should do the work.
Author: Cheng Lian <lian@databricks.com>
Closes #11864 from liancheng/spark-13473-cleanup.
Diffstat (limited to 'sql/catalyst/src')
-rw-r--r-- | sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala | 24 |
1 files changed, 1 insertions, 23 deletions
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala index 41e8dc0f46..0840d46e4e 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala @@ -891,29 +891,7 @@ object PushPredicateThroughProject extends Rule[LogicalPlan] with PredicateHelpe case a: Alias => (a.toAttribute, a.child) }) - // Split the condition into small conditions by `And`, so that we can push down part of this - // condition without nondeterministic expressions. - val andConditions = splitConjunctivePredicates(condition) - - val (deterministic, nondeterministic) = andConditions.partition(_.collect { - case a: Attribute if aliasMap.contains(a) => aliasMap(a) - }.forall(_.deterministic)) - - // If there is no nondeterministic conditions, push down the whole condition. - if (nondeterministic.isEmpty) { - project.copy(child = Filter(replaceAlias(condition, aliasMap), grandChild)) - } else { - // If they are all nondeterministic conditions, leave it un-changed. - if (deterministic.isEmpty) { - filter - } else { - // Push down the small conditions without nondeterministic expressions. - val pushedCondition = - deterministic.map(replaceAlias(_, aliasMap)).reduce(And) - Filter(nondeterministic.reduce(And), - project.copy(child = Filter(pushedCondition, grandChild))) - } - } + project.copy(child = Filter(replaceAlias(condition, aliasMap), grandChild)) } } |