author Dongjoon Hyun <dongjoon@apache.org> 2016-04-14 13:34:29 -0700
committer Reynold Xin <rxin@databricks.com> 2016-04-14 13:34:29 -0700
commit d7e124edfe2578ecdf8e816a4dda3ce430a09172
tree e7a6dc3bbc06803b10c183977d3588383039b01d /mllib/src/main/scala/org
parent bc748b7b8f3b5aee28aff9ea078c216ca137a5b7
[SPARK-14545][SQL] Improve `LikeSimplification` by adding `a%b` rule
## What changes were proposed in this pull request?

Current `LikeSimplification` handles the following four rules.
- 'a%' => expr.StartsWith("a")
- '%b' => expr.EndsWith("b")
- '%a%' => expr.Contains("a")
- 'a' => EqualTo("a")

This PR adds the following rule.
- 'a%b' => expr.Length() >= 2 && expr.StartsWith("a") && expr.EndsWith("b")

Here, 2 is statically calculated from "a".size + "b".size.

**Before**
```
scala> sql("select a from (select explode(array('abc','adc')) a) T where a like 'a%c'").explain()
== Physical Plan ==
WholeStageCodegen
:  +- Filter a#5 LIKE a%c
:     +- INPUT
+- Generate explode([abc,adc]), false, false, [a#5]
   +- Scan OneRowRelation[]
```

**After**
```
scala> sql("select a from (select explode(array('abc','adc')) a) T where a like 'a%c'").explain()
== Physical Plan ==
WholeStageCodegen
:  +- Filter ((length(a#5) >= 2) && (StartsWith(a#5, a) && EndsWith(a#5, c)))
:     +- INPUT
+- Generate explode([abc,adc]), false, false, [a#5]
   +- Scan OneRowRelation[]
```

## How was this patch tested?

Pass the Jenkins tests (including new testcase).

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #12312 from dongjoon-hyun/SPARK-14545.
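
## Why the length check is needed (illustrative sketch)

The following plain-Scala sketch is not the Catalyst code from this patch. It only illustrates why the new rule includes the `Length() >= 2` guard: without it, a prefix and suffix that overlap in a short string (for example `'a%a'` against `"a"`) would be accepted. The object name `LikeSimplificationSketch` and both helper functions are hypothetical, and the regex-based reference assumes a pattern that contains exactly one `%` and no other wildcards or escapes.

```scala
import java.util.regex.Pattern

// Hypothetical stand-in for the rewritten predicate; not Spark's implementation.
object LikeSimplificationSketch {

  // Rewritten form of `s LIKE 'prefix%suffix'`:
  // length(s) >= prefix.size + suffix.size && startsWith && endsWith.
  def rewritten(s: String, prefix: String, suffix: String): Boolean =
    s.length >= prefix.length + suffix.length &&
      s.startsWith(prefix) &&
      s.endsWith(suffix)

  // The same rewrite WITHOUT the length guard, to show the failure case.
  def withoutLengthGuard(s: String, prefix: String, suffix: String): Boolean =
    s.startsWith(prefix) && s.endsWith(suffix)

  // Reference semantics for a single-% pattern, expressed as a regex
  // (assumption: % matches any sequence of characters, including none).
  def likeReference(s: String, prefix: String, suffix: String): Boolean =
    Pattern
      .compile(Pattern.quote(prefix) + ".*" + Pattern.quote(suffix), Pattern.DOTALL)
      .matcher(s)
      .matches()

  def main(args: Array[String]): Unit = {
    // Inputs from the PR description: both rows satisfy 'a%c'.
    println(rewritten("abc", "a", "c"))          // true
    println(rewritten("adc", "a", "c"))          // true
    // Overlap case: "a" must not match 'a%a'.
    println(likeReference("a", "a", "a"))        // false
    println(withoutLengthGuard("a", "a", "a"))   // true (incorrect)
    println(rewritten("a", "a", "a"))            // false (correct)
  }
}
```

In this sketch the threshold is computed as `prefix.length + suffix.length`, mirroring the statement above that 2 is derived from "a".size + "b".size.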
Diffstat (limited to 'mllib/src/main/scala/org')
0 files changed, 0 insertions, 0 deletions