author | Sital Kedia <skedia@fb.com> | 2016-06-25 09:13:39 +0100 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2016-06-25 09:13:39 +0100 |
commit | bf665a958631125a1670504ef5966ef1a0e14798 (patch) | |
tree | 4bdefeed5732c74e577c272bb9d2651cc990dcce /sql/catalyst | |
parent | a3c7b4187bad00dad87df7e3b5929a44d29568ed (diff) | |
[SPARK-15958] Make initial buffer size for the Sorter configurable
## What changes were proposed in this pull request?
Currently the initial buffer size in the sorter is hard-coded and is too small for large workloads. As a result, the sorter spends significant time expanding the buffer and copying the data. It would be useful to make it configurable.
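To make the cost described above concrete, here is a minimal sketch that counts how many times a buffer must grow, and how many entries get copied, before it can hold a given number of records. It assumes a simple doubling policy with a full copy on every expansion; that policy, the class name `BufferGrowthSketch`, and the record counts are illustrative assumptions, not taken from the sorter's source.

```java
public class BufferGrowthSketch {

    /** Number of doublings needed before a buffer of initialSize can hold `records` entries. */
    static int expansions(long initialSize, long records) {
        if (initialSize <= 0) {
            throw new IllegalArgumentException("initialSize must be positive");
        }
        int count = 0;
        long capacity = initialSize;
        while (capacity < records) {
            capacity *= 2;   // assumed growth policy: double when full
            count++;
        }
        return count;
    }

    /** Total entries copied across all doublings, assuming the full buffer is copied each time. */
    static long entriesCopied(long initialSize, long records) {
        if (initialSize <= 0) {
            throw new IllegalArgumentException("initialSize must be positive");
        }
        long copied = 0;
        long capacity = initialSize;
        while (capacity < records) {
            copied += capacity;  // the buffer is full at the moment it has to grow
            capacity *= 2;
        }
        return copied;
    }

    public static void main(String[] args) {
        long records = 100_000_000L;  // hypothetical workload size
        for (long initial : new long[] {4096L, 1L << 20, 1L << 27}) {
            System.out.printf("initial=%d expansions=%d copied=%d%n",
                initial, expansions(initial, records), entriesCopied(initial, records));
        }
    }
}
```

Under these assumptions, a larger initial size mainly saves the repeated allocate-and-copy steps at the start of a large sort, at the price of reserving more memory up front for small ones.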
## How was this patch tested?
Tested by running a job on the cluster.
Author: Sital Kedia <skedia@fb.com>
Closes #13699 from sitalkedia/config_sort_buffer_upstream.
Diffstat (limited to 'sql/catalyst')
-rw-r--r-- | sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java | 4 |
1 files changed, 3 insertions, 1 deletions
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java b/sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java
index ad76bf5a0a..0b177ad411 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java
@@ -38,6 +38,7 @@ import org.apache.spark.util.collection.unsafe.sort.UnsafeSorterIterator;
 
 public final class UnsafeExternalRowSorter {
 
+  static final int DEFAULT_INITIAL_SORT_BUFFER_SIZE = 4096;
   /**
    * If positive, forces records to be spilled to disk at the given frequency (measured in numbers
    * of records). This is only intended to be used in tests.
@@ -85,7 +86,8 @@ public final class UnsafeExternalRowSorter {
       taskContext,
       new RowComparator(ordering, schema.length()),
       prefixComparator,
-      /* initialSize */ 4096,
+      sparkEnv.conf().getInt("spark.shuffle.sort.initialBufferSize",
+                             DEFAULT_INITIAL_SORT_BUFFER_SIZE),
       pageSizeBytes,
       canUseRadixSort
     );
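The lookup-with-default pattern in the second hunk can be sketched without a Spark dependency. In the example below, a plain `Map<String, String>` is a hypothetical stand-in for the Spark configuration object, and the class `SortBufferConfigSketch` and its method `initialBufferSize` are invented for illustration. Only the key name and the default of 4096 come from the diff.

```java
import java.util.HashMap;
import java.util.Map;

public class SortBufferConfigSketch {

    // Key and default value as they appear in the diff.
    static final String KEY = "spark.shuffle.sort.initialBufferSize";
    static final int DEFAULT_INITIAL_SORT_BUFFER_SIZE = 4096;

    /**
     * Returns the configured initial buffer size, or the default when the key is unset.
     * The Map is a stand-in for the real configuration object (an assumption of this sketch).
     */
    static int initialBufferSize(Map<String, String> conf) {
        String raw = conf.get(KEY);
        return raw == null ? DEFAULT_INITIAL_SORT_BUFFER_SIZE : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(initialBufferSize(conf));  // key unset: falls back to the default

        conf.put(KEY, "1048576");                     // hypothetical override
        System.out.println(initialBufferSize(conf));
    }
}
```

In a real job the override would typically be supplied through the normal Spark configuration path, for example `--conf spark.shuffle.sort.initialBufferSize=<value>` on `spark-submit`. Jobs that do not set the key keep the previous behavior.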