author | CrazyJvm <crazyjvm@gmail.com> | 2014-08-01 11:46:13 -0700 |
---|---|---|
committer | Patrick Wendell <pwendell@gmail.com> | 2014-08-01 11:46:14 -0700 |
commit | c82fe4781cd0356bcfdd25c7eadf1da624bb2228 (patch) | |
tree | f66df121d4d678c98c1a4626f2393c6f7fa126bc /docs | |
parent | c0b47bada3c9f0e9e0f14ab41ffb91012a357211 (diff) | |
[SQL] Documentation: Explain cacheTable command
add the `cacheTable` specification
Author: CrazyJvm <crazyjvm@gmail.com>
Closes #1681 from CrazyJvm/sql-programming-guide-cache and squashes the following commits:
0a231e0 [CrazyJvm] grammar fixes
a04020e [CrazyJvm] modify title to Cached tables
18b6594 [CrazyJvm] fix format
2cbbf58 [CrazyJvm] add cacheTable guide
Diffstat (limited to 'docs')
-rw-r--r-- | docs/sql-programming-guide.md | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index a047d32b6e..7261badd41 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -769,3 +769,13 @@ To start the Spark SQL CLI, run the following in the Spark directory:
 
 Configuration of Hive is done by placing your `hive-site.xml` file in `conf/`.
 You may run `./bin/spark-sql --help` for a complete list of all available options.
+
+# Cached tables
+
+Spark SQL can cache tables using an in-memory columnar format by calling `cacheTable("tableName")`.
+Then Spark SQL will scan only required columns and will automatically tune compression to minimize
+memory usage and GC pressure. You can call `uncacheTable("tableName")` to remove the table from memory.
+
+Note that if you just call `cache` rather than `cacheTable`, tables will _not_ be cached in
+in-memory columnar format. So we strongly recommend using `cacheTable` whenever you want to
+cache tables.
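
The added text says a columnar cache lets Spark SQL "scan only required columns". The following is a toy sketch of that idea in plain Python. It is not Spark SQL's implementation and does not use any Spark API. The sample data and the helper names (`to_columnar`, `sum_ages_row_layout`, `sum_ages_column_layout`) are hypothetical, and the sketch assumes that a row-oriented scan has to read every field of each record.

```python
# Toy illustration only: this is NOT Spark SQL's in-memory columnar store.
# It contrasts a row layout with a column layout for a query on one column.

rows = [
    {"name": "alice", "age": 30, "city": "Oslo"},
    {"name": "bob", "age": 25, "city": "Lima"},
    {"name": "carol", "age": 41, "city": "Pune"},
]


def to_columnar(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def sum_ages_row_layout(records):
    """Sum `age` and count the field values read from each row."""
    total, touched = 0, 0
    for record in records:
        # Modelling assumption: a row scan reads the whole record.
        touched += len(record)
        total += record["age"]
    return total, touched


def sum_ages_column_layout(columns):
    """Sum `age` by reading only the `age` column."""
    ages = columns["age"]
    return sum(ages), len(ages)


columns = to_columnar(rows)
row_total, row_touched = sum_ages_row_layout(rows)
col_total, col_touched = sum_ages_column_layout(columns)
print(row_total, row_touched)
print(col_total, col_touched)
```

Both layouts produce the same sum, but in this model the columnar scan reads 3 values where the row scan reads 9. Storing each column contiguously is also what makes per-column compression practical, although the sketch does not attempt to model that.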