author: Lu Yan <luyan02@baidu.com> 2015-02-09 16:25:38 -0800
committer: Michael Armbrust <michael@databricks.com> 2015-02-09 16:25:38 -0800
commit: 2a36292534a1e9f7a501e88f69bfc3a09fb62cb3 (patch)
tree: 46697523464994edea831e879f6f95286a540c36 /sql/core
parent: b8080aa86d55e0467fd4328f10a2f0d6605e6cc6 (diff)
[SPARK-5614][SQL] Predicate pushdown through Generate.
Currently, Catalyst's rules cannot push predicates through "Generate" nodes. Furthermore, partition pruning in HiveTableScan cannot be applied to queries that involve "Generate". This makes such queries very inefficient.

In practice, this patch finds patterns like

```scala
Filter(predicate, Generate(generator, _, _, _, grandChild))
```

and splits the predicate into two parts, according to whether each conjunct references a column generated by the Generate node. A new Filter is created beneath the Generate node for the conjuncts that can be pushed down. If nothing is left for the original Filter, it is removed.

For example, the physical plan for the query

```sql
select len, bk
from s_server lateral view explode(len_arr) len_table as len
where len > 5 and day = '20150102';
```

where 'day' is a partition column in the metastore, looks like this in the current version of Spark SQL:

> Project [len, bk]
>
> Filter ((len > "5") && (day = "20150102"))
>
> Generate explode(len_arr), true, false
>
> HiveTableScan [bk, len_arr, day], (MetastoreRelation default, s_server, None), None

But theoretically the plan should look like this:

> Project [len, bk]
>
> Filter (len > "5")
>
> Generate explode(len_arr), true, false
>
> HiveTableScan [bk, len_arr, day], (MetastoreRelation default, s_server, None), Some(day = "20150102")

where the partition pruning predicate is pushed down to the HiveTableScan node.

Author: Lu Yan <luyan02@baidu.com>

Closes #4394 from ianluyan/ppd and squashes the following commits:

a67dce9 [Lu Yan] Fix English grammar.
7cea911 [Lu Yan] Revised based on @marmbrus's opinions
ffc59fc [Lu Yan] [SPARK-5614][SQL] Predicate pushdown through Generate.
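The predicate split described in the commit message can be illustrated with a small, self-contained sketch. This is not the Catalyst code from this patch: the `Expr`, `Cmp`, `And`, `Plan`, `Scan`, `Generate`, and `Filter` types below are simplified, hypothetical stand-ins (for instance, this toy `Generate` has three fields rather than the five in the pattern above), and `PushThroughGenerate` is an assumed name chosen for illustration.

```scala
// Toy model of the rewrite. These types are hypothetical stand-ins,
// not Catalyst's real expression or logical-plan classes.
sealed trait Expr { def references: Set[String] }
case class Cmp(column: String, op: String, value: String) extends Expr {
  def references: Set[String] = Set(column)
}
case class And(left: Expr, right: Expr) extends Expr {
  def references: Set[String] = left.references ++ right.references
}

sealed trait Plan
case class Scan(table: String) extends Plan
case class Generate(generator: String, generatedColumns: Set[String], child: Plan) extends Plan
case class Filter(condition: Expr, child: Plan) extends Plan

object PushThroughGenerate {
  // Flatten nested ANDs into a list of conjuncts.
  def splitConjuncts(e: Expr): List[Expr] = e match {
    case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
    case other     => List(other)
  }

  def apply(plan: Plan): Plan = plan match {
    case Filter(cond, g @ Generate(_, generated, grandChild)) =>
      // A conjunct may move below Generate only if it references no generated column.
      val (pushDown, stayUp) =
        splitConjuncts(cond).partition(c => c.references.intersect(generated).isEmpty)
      if (pushDown.isEmpty) plan
      else {
        val newGenerate =
          g.copy(child = Filter(pushDown.reduce[Expr](And(_, _)), grandChild))
        // Drop the outer Filter entirely when nothing is left for it.
        if (stayUp.isEmpty) newGenerate
        else Filter(stayUp.reduce[Expr](And(_, _)), newGenerate)
      }
    case other => other
  }
}
```

Applied to a plan modelled on the example query, `len > 5` stays above the toy `Generate` because `len` is a generated column, while `day = '20150102'` moves beneath it, next to the scan, where a partition-pruning step could pick it up.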
Diffstat (limited to 'sql/core')
0 files changed, 0 insertions, 0 deletions