author     Dongjoon Hyun <dongjoon@apache.org>    2016-04-22 14:14:47 -0700
committer  Reynold Xin <rxin@databricks.com>      2016-04-22 14:14:47 -0700
commit     3647120a5a879edf3a96a5fd68fb7aa849ad57ef (patch)
tree       6725ba31694bba605fa7d5fdce4c135b4821367b /streaming
parent     0dcf9dbebbd53aaebe17c85ede7ab7847ce83137 (diff)
[SPARK-14796][SQL] Add spark.sql.optimizer.inSetConversionThreshold config option.
## What changes were proposed in this pull request?

Currently, the `OptimizeIn` optimizer rule replaces an `In` expression with an `InSet` expression if the size of the value set is greater than a hard-coded constant, 10. This issue aims to make that threshold configurable through a new option, `spark.sql.optimizer.inSetConversionThreshold`. After this PR, `OptimizeIn` is configurable.

```scala
scala> sql("select a in (1,2,3) from (select explode(array(1,2)) a) T").explain()
== Physical Plan ==
WholeStageCodegen
:  +- Project [a#7 IN (1,2,3) AS (a IN (1, 2, 3))#8]
:     +- INPUT
+- Generate explode([1,2]), false, false, [a#7]
   +- Scan OneRowRelation[]

scala> sqlContext.setConf("spark.sql.optimizer.inSetConversionThreshold", "2")

scala> sql("select a in (1,2,3) from (select explode(array(1,2)) a) T").explain()
== Physical Plan ==
WholeStageCodegen
:  +- Project [a#16 INSET (1,2,3) AS (a IN (1, 2, 3))#17]
:     +- INPUT
+- Generate explode([1,2]), false, false, [a#16]
   +- Scan OneRowRelation[]
```

## How was this patch tested?

Pass the Jenkins tests (with a new test case).

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #12562 from dongjoon-hyun/SPARK-14796.
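For reference, a minimal self-contained sketch of the same experiment as a standalone application rather than a spark-shell session. It assumes a Spark 2.x-era `SparkSession` API; the `InSetThresholdDemo` object name and the `local[*]` master are illustrative, not part of this patch:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object, not part of the patch itself.
object InSetThresholdDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("InSetThresholdDemo")
      .getOrCreate()

    // Default threshold is 10, so a 3-element value list stays an `In` expression.
    spark.sql("select a in (1,2,3) from (select explode(array(1,2)) a) T").explain()

    // Lower the threshold so value lists larger than 2 are converted to `InSet`.
    spark.conf.set("spark.sql.optimizer.inSetConversionThreshold", "2")
    spark.sql("select a in (1,2,3) from (select explode(array(1,2)) a) T").explain()

    spark.stop()
  }
}
```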
Diffstat (limited to 'streaming')
0 files changed, 0 insertions, 0 deletions