author | Cheng Lian <lian@databricks.com> | 2014-10-26 16:10:09 -0700 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2014-10-26 16:10:09 -0700 |
commit | 2838bf8aadd5228829c1a869863bc4da7877fdfb (patch) | |
tree | 474e9dc739631b81c20c812c38413d969fe47f2c /project/plugins.sbt | |
parent | 879a16585808e8fe34bdede741565efc4c9f9bb3 (diff) | |
[SPARK-3537][SPARK-3914][SQL] Refines in-memory columnar table statistics
This PR refines in-memory columnar table statistics:
1. adds 2 more statistics for in-memory table columns: `count` and `sizeInBytes`.
1. adds filter pushdown support for `IS NULL` and `IS NOT NULL`.
1. caches and propagates statistics in `InMemoryRelation` once the underlying cached RDD is materialized.
Statistics are collected on the driver side with an accumulator.
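
To illustrate how `count` and a null count can drive `IS NULL` / `IS NOT NULL` pushdown, here is a minimal Python sketch. It is not the Spark implementation: the names `ColumnStats`, `collect_stats`, and `prune` are hypothetical, the 4-byte integer width is an assumption, and a plain sum on the "driver" stands in for the accumulator.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class ColumnStats:
    """Per-batch statistics for one integer column (hypothetical layout)."""
    null_count: int     # number of null values in the batch
    count: int          # number of rows in the batch, nulls included
    size_in_bytes: int  # estimated size of the non-null values


def collect_stats(values: Sequence[Optional[int]], int_size: int = 4) -> ColumnStats:
    """Gather statistics for one cached batch.

    Assumption: each non-null integer occupies `int_size` bytes and
    nulls are not counted toward the size.
    """
    non_null = sum(1 for v in values if v is not None)
    return ColumnStats(
        null_count=len(values) - non_null,
        count=len(values),
        size_in_bytes=int_size * non_null,
    )


def may_contain_null(stats: ColumnStats) -> bool:
    """Batch filter for `col IS NULL`: keep batches with at least one null."""
    return stats.null_count > 0


def may_contain_non_null(stats: ColumnStats) -> bool:
    """Batch filter for `col IS NOT NULL`: keep batches with at least one value."""
    return stats.count - stats.null_count > 0


def prune(stats: List[ColumnStats], keep: Callable[[ColumnStats], bool]) -> List[int]:
    """Return the indices of batches that must still be scanned."""
    return [i for i, s in enumerate(stats) if keep(s)]


# Three cached batches of a nullable integer column.
batches = [[1, 2, 3], [None, None], [4, None, 6]]
batch_stats = [collect_stats(b) for b in batches]

is_null_batches = prune(batch_stats, may_contain_null)          # [1, 2]
is_not_null_batches = prune(batch_stats, may_contain_non_null)  # [0, 2]

# Stand-in for the accumulator: sum the per-batch sizes on the "driver".
total_size_in_bytes = sum(s.size_in_bytes for s in batch_stats)
```

The point of the sketch is that a batch is skipped only when its statistics prove that no row can match, so pruning never changes query results.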
This PR also fixes SPARK-3914 by properly propagating in-memory statistics.
Author: Cheng Lian <lian@databricks.com>
Closes #2860 from liancheng/propagates-in-mem-stats and squashes the following commits:
0cc5271 [Cheng Lian] Restricts visibility of o.a.s.s.c.p.l.Statistics
c5ff904 [Cheng Lian] Fixes test table name conflict
a8c818d [Cheng Lian] Refines tests
1d01074 [Cheng Lian] Bug fix: shouldn't call STRING.actualSize on null string value
7dc6a34 [Cheng Lian] Adds more in-memory table statistics and propagates them properly