author     Dongjoon Hyun <dongjoon@apache.org>    2016-04-22 14:14:47 -0700
committer  Reynold Xin <rxin@databricks.com>      2016-04-22 14:14:47 -0700
commit     3647120a5a879edf3a96a5fd68fb7aa849ad57ef (patch)
tree       6725ba31694bba605fa7d5fdce4c135b4821367b /streaming
parent     0dcf9dbebbd53aaebe17c85ede7ab7847ce83137 (diff)
[SPARK-14796][SQL] Add spark.sql.optimizer.inSetConversionThreshold config option.
## What changes were proposed in this pull request?

Currently, the `OptimizeIn` optimizer rule replaces an `In` expression with an `InSet` expression if the size of the value set is greater than a hard-coded constant, 10. This issue aims to make that threshold configurable through a new option, `spark.sql.optimizer.inSetConversionThreshold`. After this PR, `OptimizeIn` is configurable.

```scala
scala> sql("select a in (1,2,3) from (select explode(array(1,2)) a) T").explain()
== Physical Plan ==
WholeStageCodegen
:  +- Project [a#7 IN (1,2,3) AS (a IN (1, 2, 3))#8]
:     +- INPUT
+- Generate explode([1,2]), false, false, [a#7]
   +- Scan OneRowRelation[]

scala> sqlContext.setConf("spark.sql.optimizer.inSetConversionThreshold", "2")

scala> sql("select a in (1,2,3) from (select explode(array(1,2)) a) T").explain()
== Physical Plan ==
WholeStageCodegen
:  +- Project [a#16 INSET (1,2,3) AS (a IN (1, 2, 3))#17]
:     +- INPUT
+- Generate explode([1,2]), false, false, [a#16]
   +- Scan OneRowRelation[]
```

## How was this patch tested?

Pass the Jenkins tests (with a new test case).

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #12562 from dongjoon-hyun/SPARK-14796.
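For reference, a minimal self-contained sketch of the same experiment as a standalone application rather than a spark-shell session. It assumes a Spark 2.x-era `SparkSession` API; the `InSetThresholdDemo` object name and the `local[*]` master are illustrative, not part of this patch:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object, not part of the patch itself.
object InSetThresholdDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("InSetThresholdDemo")
      .getOrCreate()

    // Default threshold is 10, so a 3-element value list stays an `In` expression.
    spark.sql("select a in (1,2,3) from (select explode(array(1,2)) a) T").explain()

    // Lower the threshold so value lists larger than 2 are converted to `InSet`.
    spark.conf.set("spark.sql.optimizer.inSetConversionThreshold", "2")
    spark.sql("select a in (1,2,3) from (select explode(array(1,2)) a) T").explain()

    spark.stop()
  }
}
```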
Diffstat (limited to 'streaming')
0 files changed, 0 insertions, 0 deletions