author | Nong Li <nong@databricks.com> | 2015-11-10 11:28:53 -0800 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2015-11-10 11:28:53 -0800 |
commit | 87aedc48c01dffbd880e6ca84076ed47c68f88d0 (patch) | |
tree | 03427b95d0f7032722373fdd05b5f07b6361d7e0 /R/pkg/inst | |
parent | 53600854c270d4c953fe95fbae528740b5cf6603 (diff) | |
[SPARK-10371][SQL] Implement subexpr elimination for UnsafeProjections
This patch adds the building blocks for code-generating subexpression elimination and
implements it end to end for UnsafeProjection. The same building blocks can be reused to
do the same thing for other operators.
It introduces utilities to compute common subexpressions, built around a data structure
to which expressions can be added. Each added expression and its children are recursively
matched against previously added expressions and grouped into sets of equivalent
expressions. Matching uses the existing `semanticEquals`, so it does not understand
commutative or associative expressions. That can be done as future work.
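As a rough illustration of the grouping described above, here is a minimal Scala sketch.
Every name in it (`Expr`, `Col`, `Lit`, `Add`, `Mul`, `CommonSubexprs`, `addTree`, `common`)
is hypothetical and not taken from the patch, and case-class structural equality stands in
for `semanticEquals`, so it is an assumption-laden toy rather than the patch's implementation.

```scala
import scala.collection.mutable

// Toy expression tree. These types are illustrative only, not Catalyst's.
sealed trait Expr { def children: Seq[Expr] }
case class Col(name: String) extends Expr { def children: Seq[Expr] = Nil }
case class Lit(value: Int) extends Expr { def children: Seq[Expr] = Nil }
case class Add(left: Expr, right: Expr) extends Expr { def children: Seq[Expr] = Seq(left, right) }
case class Mul(left: Expr, right: Expr) extends Expr { def children: Seq[Expr] = Seq(left, right) }

// Hypothetical container: counts occurrences of structurally equal expressions.
// Case-class equality is a stand-in for semanticEquals (an assumption).
class CommonSubexprs {
  private val counts = mutable.LinkedHashMap.empty[Expr, Int]

  // Add an expression and, the first time it is seen, its children recursively.
  def addTree(e: Expr): Unit = {
    val seenBefore = counts.contains(e)
    counts(e) = counts.getOrElse(e, 0) + 1
    // Children of an already-known expression are covered by its existing group.
    if (!seenBefore) e.children.foreach(addTree)
  }

  // Non-leaf expressions that occur at least twice.
  def common: Seq[Expr] =
    counts.collect { case (e, n) if n >= 2 && e.children.nonEmpty => e }.toSeq
}
```

Adding `Mul(Add(Col("a"), Col("b")), Lit(2))` and `Add(Add(Col("a"), Col("b")), Lit(1))`
to this sketch reports `Add(Col("a"), Col("b"))` as common. `Add(Col("b"), Col("a"))` would
not match it, mirroring the commutativity limitation noted above.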
After building this data structure, the codegen process takes advantage of it by:
1. Generating a helper function in the generated class that computes the common
subexpression. This is done for every common subexpression that occurs at least
twice and whose expression tree is sufficiently complex.
2. When generating the apply() function, if the helper function exists, calling it
instead of regenerating the expression tree. Repeated calls to the helper function
short-circuit the evaluation logic.
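The two steps above can be pictured with a hand-written Scala stand-in for a generated
class. The real codegen emits Java source and its shape may well differ; the class name
`SketchProjection`, the helper `evalSubExpr`, the per-row reset, and the `evalCount`
counter are all assumptions made for this illustration.

```scala
// Hypothetical stand-in for a generated projection computing ((a + b) * 2, (a + b) + 1).
class SketchProjection {
  // Cached state for the common subexpression a + b.
  private var subExprEvaluated = false
  private var subExprValue = 0

  // Instrumentation for this example only: how often a + b was really computed.
  var evalCount = 0

  // Helper for the common subexpression. The first call per row computes and
  // caches the value; later calls return the cached value immediately.
  private def evalSubExpr(a: Int, b: Int): Int = {
    if (!subExprEvaluated) {
      subExprValue = a + b
      subExprEvaluated = true
      evalCount += 1
    }
    subExprValue
  }

  def apply(a: Int, b: Int): (Int, Int) = {
    subExprEvaluated = false // assumed: the cache is reset for each input row
    (evalSubExpr(a, b) * 2, evalSubExpr(a, b) + 1)
  }
}
```

In this sketch, one `apply` call invokes the helper twice but computes `a + b` only once,
which is the saving the helper function is meant to provide.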
Author: Nong Li <nong@databricks.com>
Author: Nong Li <nongli@gmail.com>
This patch had conflicts when merged, resolved by
Committer: Michael Armbrust <michael@databricks.com>
Closes #9480 from nongli/spark-10371.
Diffstat (limited to 'R/pkg/inst')
0 files changed, 0 insertions, 0 deletions