author wangzhenhua <wangzhenhua@huawei.com> 2017-03-06 21:45:36 -0800
committer Xiao Li <gatorsmile@gmail.com> 2017-03-06 21:45:36 -0800
commit 9909f6d361fdf2b7ef30fa7fbbc91e00f2999794
tree 4c8e52db7af59664f066bd3b06ff93d576dccecc /R
parent b0a5cd89097c563e9949d8cfcf84d18b03b8d24c
[SPARK-19350][SQL] Cardinality estimation of Limit and Sample
## What changes were proposed in this pull request?

Before this PR, LocalLimit/GlobalLimit/Sample propagated the same row count and column stats from their child, which is incorrect. We can get the correct rowCount in Statistics for GlobalLimit/Sample whether CBO is enabled or not. We don't know the rowCount for LocalLimit because we don't know the partition number at that time. Column stats should not be propagated because we don't know the distribution of columns after Limit or Sample.

## How was this patch tested?

Added test cases.

Author: wangzhenhua <wangzhenhua@huawei.com>

Closes #16696 from wzhfy/limitEstimation.
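
As a rough illustration of the estimation rules described above, here is a minimal Python sketch. It is not the Spark implementation: the `Stats` class and the three `estimate_*` functions are hypothetical names made up for this example. Capping the row count at the limit, falling back to the limit when the child's row count is unknown, and rounding the sampled row count up are all assumptions of the sketch rather than details stated in the message.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Stats:
    """Hypothetical stand-in for a plan node's statistics."""
    row_count: Optional[int]                      # None means "unknown"
    column_stats: Dict[str, dict] = field(default_factory=dict)


def estimate_global_limit(child: Stats, limit: int) -> Stats:
    # Assumption: a global limit returns at most `limit` rows, so the
    # estimate is capped by the child's row count when that is known.
    rows = limit if child.row_count is None else min(limit, child.row_count)
    # Column stats are dropped: the distribution after a limit is unknown.
    return Stats(row_count=rows)


def estimate_local_limit(child: Stats, limit: int) -> Stats:
    # The limit applies per partition and the partition count is not known
    # at estimation time, so the row count is left unknown.
    return Stats(row_count=None)


def estimate_sample(child: Stats, fraction: float) -> Stats:
    # Assumption: scale the child's row count by the sampling fraction and
    # round up; column stats are dropped for the same reason as above.
    if child.row_count is None:
        return Stats(row_count=None)
    return Stats(row_count=math.ceil(child.row_count * fraction))


child = Stats(row_count=1000, column_stats={"id": {"distinct_count": 1000}})
limited = estimate_global_limit(child, 10)
sampled = estimate_sample(child, 0.25)
local = estimate_local_limit(child, 10)
```

In this sketch none of the three estimators copies `column_stats` from the child, which mirrors the point that column-level statistics should not be propagated through Limit or Sample.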