author Michael Armbrust <michael@databricks.com> 2016-02-22 15:27:29 -0800
committer Michael Armbrust <michael@databricks.com> 2016-02-22 15:27:29 -0800
commit 173aa949c309ff7a7a03e9d762b9108542219a95
tree 8dc2978ccaa7c4011aeaeb5c358a49f055a44ef6 /README.md
parent 4a91806a45a48432c3ea4c2aaa553177952673e9
[SPARK-12546][SQL] Change default number of open parquet files

A common problem that users encounter with Spark 1.6.0 is that writing to a partitioned parquet table OOMs. The root cause is that parquet allocates a significant amount of memory that is not accounted for by our own mechanisms. As a workaround, we can ensure that only a single file is open per task unless the user explicitly asks for more.

Author: Michael Armbrust <michael@databricks.com>

Closes #11308 from marmbrus/parquetWriteOOM.
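
The workaround described in the message above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's writer code: `write_partitioned`, its parameters, and the fall-back-to-sorting strategy are hypothetical names and choices made for illustration. It models one write task that may keep at most `max_open_files` partition writers open; once the limit is reached, it sorts the remaining rows by partition so that only one writer needs to be open at a time.

```python
from itertools import groupby


def write_partitioned(rows, key, max_open_files=1):
    """Toy model of a single write task (hypothetical, not Spark's API).

    Returns (output, peak_open_writers), where output maps each
    partition value to the rows written for it.
    """
    rows = list(rows)
    output = {}           # partition value -> rows written to it
    open_writers = set()  # partitions whose writer currently holds a buffer
    peak = 0

    for i, row in enumerate(rows):
        part = key(row)
        if part not in open_writers:
            if len(open_writers) >= max_open_files:
                # Limit reached: close every open writer, then sort the
                # remaining rows so each partition's rows are contiguous
                # and a single writer is open at any moment.
                open_writers.clear()
                remaining = sorted(rows[i:], key=key)
                for p, group in groupby(remaining, key=key):
                    output.setdefault(p, []).extend(group)
                    peak = max(peak, 1)
                return output, peak
            open_writers.add(part)
            peak = max(peak, len(open_writers))
        output.setdefault(part, []).append(row)

    return output, peak


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
out1, peak1 = write_partitioned(rows, key=lambda r: r[0], max_open_files=1)
out3, peak3 = write_partitioned(rows, key=lambda r: r[0], max_open_files=3)
print(peak1, peak3)
```

In this model both calls produce the same per-partition output, but the number of simultaneously open writers, and therefore the amount of per-writer buffer memory held at once, is bounded by the limit. The trade-off in the sketch is an extra sort when the limit is low.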
Diffstat (limited to 'README.md')
0 files changed, 0 insertions, 0 deletions