author | Brian Cho <bcho@fb.com> | 2016-07-24 19:36:58 -0700 |
---|---|---|
committer | Josh Rosen <joshrosen@databricks.com> | 2016-07-24 19:36:58 -0700 |
commit | daace6014216b996bcc8937f1fdcea732b6910ca (patch) | |
tree | ae328f4d9fb1e11cc0034ed26665082dd92507a8 /graphx | |
parent | 1221ce04029154778ccb5453e348f6d116092cc5 (diff) | |
[SPARK-5581][CORE] When writing sorted map output file, avoid open / close between each partition
## What changes were proposed in this pull request?
Replace commitAndClose with separate commit and close to avoid opening and closing
the file between partitions.
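
The idea of splitting commit from close can be illustrated with a minimal Python sketch. This is not Spark's code: `SegmentWriter` and `write_partitions` are hypothetical names, and the sketch only assumes that a commit needs to report the byte range of one partition while the underlying file stays open for the next one. A revert, under the same assumption, discards whatever was written since the last commit.

```python
import os
import tempfile


class SegmentWriter:
    """Hypothetical writer: one open file handle, many committed segments."""

    def __init__(self, path):
        self._file = open(path, "wb")  # opened once for all partitions
        self._committed = 0            # end offset of the last committed segment

    def write(self, data):
        self._file.write(data)

    def commit(self):
        """Flush and return (offset, length) of the segment; the file stays open."""
        self._file.flush()
        end = self._file.tell()
        segment = (self._committed, end - self._committed)
        self._committed = end
        return segment

    def revert(self):
        """Discard bytes written since the last commit."""
        self._file.flush()
        self._file.seek(self._committed)
        self._file.truncate()

    def close(self):
        self._file.close()


def write_partitions(path, partitions):
    """Write every partition to one file, committing after each, closing once."""
    writer = SegmentWriter(path)
    segments = []
    try:
        for records in partitions:
            for record in records:
                writer.write(record)
            segments.append(writer.commit())  # no close/reopen between partitions
    finally:
        writer.close()
    return segments


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "shuffle.data")
    print(write_partitions(demo_path, [[b"a", b"bb"], [], [b"ccc"]]))
    # prints [(0, 3), (3, 0), (3, 3)]
```

In this sketch the file is opened and closed exactly once regardless of the number of partitions, which is the cost the combined commit-and-close call would otherwise pay per partition.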
## How was this patch tested?
Ran the existing unit tests and added a few unit tests covering reverts.
Observed a ~20% reduction in total task time on stages with shuffle
writes to many partitions.
JoshRosen
Author: Brian Cho <bcho@fb.com>
Closes #13382 from dafrista/separatecommit-master.
Diffstat (limited to 'graphx')
0 files changed, 0 insertions, 0 deletions