author Davies Liu <davies@databricks.com> 2016-06-23 11:48:48 -0700
committer Davies Liu <davies.liu@gmail.com> 2016-06-23 11:48:48 -0700
commit 10396d9505c752cc18b6424f415d4ff0f460ad65
tree b2f45ba2c96e182d2fad139d022651dbfe88494a /yarn/src
parent 60398dabc50d402bbab4190fbe94ebed6d3a48dc
[SPARK-16163] [SQL] Cache the statistics for logical plans
## What changes were proposed in this pull request?

The calculation of statistics is no longer trivial, and it can be very slow on a large query (for example, TPC-DS Q64 took several minutes to plan).

During the planning of a query, the statistics of a logical plan should not change (even for InMemoryRelation), so we should use `lazy val` to cache the statistics.

For InMemoryRelation, the statistics can be updated after materialization. That update is only useful when the relation is used in another query (before that query is planned), because once planning has finished, the statistics are not used anymore.

## How was this patch tested?

Tested with TPC-DS Q64, which can be planned in a second after the patch.

Author: Davies Liu <davies@databricks.com>

Closes #13871 from davies/fix_statistics.
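
To illustrate why caching with `lazy val` helps, here is a minimal sketch. It does not use Spark's actual classes: `Stats`, `Plan`, `Scan`, `Filter`, `Calls`, and `run` are hypothetical names invented for this example. It assumes a planner that asks every node of a linear plan for its statistics, and it counts how many estimates are computed with and without caching.

```scala
object StatsCacheSketch {
  // Hypothetical stand-in for a plan's statistics (not Spark's class).
  final case class Stats(sizeInBytes: BigInt)

  // Counts how often an estimate is actually computed; illustration only.
  object Calls { var uncached = 0; var cached = 0 }

  sealed trait Plan {
    def children: Seq[Plan]
    protected def estimate(childStats: Seq[Stats]): Stats

    // Recomputed on every access: each call walks the whole subtree again.
    def statsUncached: Stats = {
      Calls.uncached += 1
      estimate(children.map(_.statsUncached))
    }

    // Computed at most once per node; later accesses reuse the stored value.
    lazy val statsCached: Stats = {
      Calls.cached += 1
      estimate(children.map(_.statsCached))
    }
  }

  // Leaf node with a fixed size.
  final case class Scan(sizeInBytes: BigInt) extends Plan {
    val children: Seq[Plan] = Nil
    protected def estimate(childStats: Seq[Stats]): Stats = Stats(sizeInBytes)
  }

  // Unary node that, in this sketch, simply passes its child's size through.
  final case class Filter(child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
    protected def estimate(childStats: Seq[Stats]): Stats = childStats.head
  }

  // Builds a chain of `depth` nodes, asks every node for its statistics once
  // per variant, and returns (uncached computations, cached computations).
  def run(depth: Int): (Int, Int) = {
    Calls.uncached = 0
    Calls.cached = 0
    val nodes: List[Plan] =
      Iterator.iterate[Plan](Scan(BigInt(1024)))(Filter(_)).take(depth).toList
    nodes.foreach(_.statsUncached) // quadratic in depth
    nodes.foreach(_.statsCached)   // linear in depth
    (Calls.uncached, Calls.cached)
  }

  def main(args: Array[String]): Unit =
    println(run(100)) // prints (5050,100)
}
```

In this sketch the uncached variant does 1 + 2 + ... + 100 = 5050 computations for a chain of 100 nodes, while the `lazy val` variant does 100. The cached value is never refreshed, which is only safe under the assumption stated above: statistics do not change while a query is being planned.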
Diffstat (limited to 'yarn/src')
0 files changed, 0 insertions, 0 deletions