path: root/docs/graphx-programming-guide.md
author Michael Armbrust <michael@databricks.com> 2014-11-03 14:08:27 -0800
committer Michael Armbrust <michael@databricks.com> 2014-11-03 14:08:27 -0800
commit 25bef7e6951301e93004567fc0cef96bf8d1a224
tree 73941695b30cb7cdf96c9805935697162c578b14 /docs/graphx-programming-guide.md
parent e83f13e8d37ca33f4e183e977d077221b90c6025
[SQL] More aggressive defaults
- Turns on compression for in-memory cached data by default
- Changes the default parquet compression format back to gzip (we have seen more OOMs with production workloads due to the way Snappy allocates memory)
- Ups the batch size to 10,000 rows
- Increases the broadcast threshold to 10mb.
- Uses our parquet implementation instead of the hive one by default.
- Cache parquet metadata by default.

Author: Michael Armbrust <michael@databricks.com>

Closes #3064 from marmbrus/fasterDefaults and squashes the following commits:

97ee9f8 [Michael Armbrust] parquet codec docs
e641694 [Michael Armbrust] Remote also
a12866a [Michael Armbrust] Cache metadata.
2d73acc [Michael Armbrust] Update docs defaults.
d63d2d5 [Michael Armbrust] document parquet option
da373f9 [Michael Armbrust] More aggressive defaults
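
For readers who want to pin these defaults explicitly, the six changes in the commit message above can be sketched as a settings map. This is an illustration only: the commit message does not name any configuration keys, so every key below is an assumption about which Spark SQL option each bullet refers to, and "10mb" is assumed to mean 10 * 1024 * 1024 bytes. The names `AGGRESSIVE_DEFAULTS` and `render_conf` are hypothetical helpers, not part of Spark.

```python
# Hypothetical mapping from the commit's bullet points to Spark SQL settings.
# The key names are assumptions; the commit message does not list them.
AGGRESSIVE_DEFAULTS = {
    # "Turns on compression for in-memory cached data"
    "spark.sql.inMemoryColumnarStorage.compressed": "true",
    # "Changes the default parquet compression format back to gzip"
    "spark.sql.parquet.compression.codec": "gzip",
    # "Ups the batch size to 10,000 rows"
    "spark.sql.inMemoryColumnarStorage.batchSize": str(10_000),
    # "Increases the broadcast threshold to 10mb" (assumed to be a byte count)
    "spark.sql.autoBroadcastJoinThreshold": str(10 * 1024 * 1024),
    # "Uses our parquet implementation instead of the hive one"
    "spark.sql.hive.convertMetastoreParquet": "true",
    # "Cache parquet metadata by default"
    "spark.sql.parquet.cacheMetadata": "true",
}


def render_conf(settings):
    """Render settings as sorted `key value` lines, one setting per line."""
    return "\n".join(f"{key} {value}" for key, value in sorted(settings.items()))


if __name__ == "__main__":
    print(render_conf(AGGRESSIVE_DEFAULTS))
```

The rendered lines follow a plain `key value` layout, so they can be compared against whatever configuration a deployment already uses before any value is changed.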
Diffstat (limited to 'docs/graphx-programming-guide.md')
0 files changed, 0 insertions, 0 deletions