author | Cheng Hao <hao.cheng@intel.com> | 2014-05-15 22:12:34 -0700 |
---|---|---|
committer | Reynold Xin <rxin@apache.org> | 2014-05-15 22:12:49 -0700 |
commit | eac4ee89021b3929d129c94a3116040e9281a636 (patch) | |
tree | 4e4d26afb4069380b3fae76cda36582bd684ffe9 /streaming | |
parent | 54414716ba9d3f02cfcaccf292d6254783617f78 (diff) | |
[Spark-1461] Deferred Expression Evaluation (short-circuit evaluation)
This patch unifies the foldable and nullable interfaces for Expression.
1) Non-deterministic UDFs (such as Rand()) cannot be folded.
2) Short-circuit evaluation significantly improves the performance of expression evaluation. However, a stateful UDF must not be skipped by a short-circuit (e.g. in the expression col1 > 0 and row_sequence() < 1000, row_sequence() cannot be skipped even if col1 > 0 is false).
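The stateful-UDF caveat in point 2 can be illustrated with a small sketch. This is not the Catalyst code from the patch: `Expr`, `eval_and`, `make_row_sequence`, and `filter_flags` are hypothetical names, the `stateful` flag is an assumed way of marking such UDFs, and SQL null handling is ignored for brevity.

```python
from itertools import count


class Expr:
    """Hypothetical expression node: wraps a function of the row."""

    def __init__(self, fn, stateful=False):
        self.fn = fn
        self.stateful = stateful  # True if evaluation has side effects

    def eval(self, row):
        return self.fn(row)


def eval_and(left, right, row):
    """Evaluate `left AND right`, short-circuiting only when it is safe."""
    left_value = left.eval(row)
    if not left_value and not right.stateful:
        return False  # right side has no state, so skipping it is safe
    # Either left is true, or right is stateful and must see every row.
    right_value = right.eval(row)
    return bool(left_value and right_value)


def make_row_sequence():
    """Hypothetical stand-in for row_sequence(): returns 1, 2, 3, ... per call."""
    counter = count(1)
    return lambda row: next(counter)


def filter_flags(rows, limit):
    """Evaluate `col1 > 0 and row_sequence() < limit` for each row."""
    row_sequence = make_row_sequence()
    left = Expr(lambda row: row["col1"] > 0)
    right = Expr(lambda row: row_sequence(row) < limit, stateful=True)
    return [eval_and(left, right, row) for row in rows]
```

With rows whose col1 values are -1, 5, 0, 7 and a limit of 3, `filter_flags` returns `[False, True, False, False]`. The counter advances on every row, so the second row sees sequence number 2 rather than 1. A naive short-circuit would skip the counter on rows where col1 > 0 is false, and the fourth row would then wrongly pass the filter.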
I borrowed the DeferredObject concept from Hive. It has two kinds of subclasses (EagerResult / DeferredResult): the former requires the evaluation to be triggered before it is created, while the latter triggers the evaluation when its get() method is first called.
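A minimal sketch of the two result kinds follows. It is in Python rather than the Scala of the patch, so the class bodies are illustrative only. Caching the value after the first get() is an assumption here; the description above says only that evaluation is triggered on the first call.

```python
class EagerResult:
    """Holds a value that was already computed before construction."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


class DeferredResult:
    """Holds a zero-argument callable and runs it on the first get()."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._evaluated = False
        self._value = None

    def get(self):
        if not self._evaluated:
            self._value = self._thunk()  # evaluation happens here, not in __init__
            self._evaluated = True
            self._thunk = None  # assumption: cache the result, drop the callable
        return self._value
```

Both classes expose the same get() method, so a caller such as a short-circuiting operator can accept either one and decide whether to call get() at all.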
Author: Cheng Hao <hao.cheng@intel.com>
Closes #446 from chenghao-intel/expression_deferred_evaluation and squashes the following commits:
d2729de [Cheng Hao] Fix the codestyle issues
a08f09c [Cheng Hao] fix bug in or/and short-circuit evaluation
af2236b [Cheng Hao] revert the short-circuit expression evaluation for IF
b7861d2 [Cheng Hao] Add Support for Deferred Expression Evaluation
(cherry picked from commit a20fea98811d98958567780815fcf0d4fb4e28d4)
Signed-off-by: Reynold Xin <rxin@apache.org>
Diffstat (limited to 'streaming')
0 files changed, 0 insertions, 0 deletions