author | Sital Kedia <skedia@fb.com> | 2016-06-30 10:53:18 -0700 |
---|---|---|
committer | Davies Liu <davies.liu@gmail.com> | 2016-06-30 10:53:18 -0700 |
commit | 07f46afc733b1718d528a6ea5c0d774f047024fa (patch) | |
tree | d150b6afc44dfdc05e57ebaba4136b0c0e4cf063 /sql/catalyst | |
parent | 5344bade8efb6f12aa43fbfbbbc2e3c0c7d16d98 (diff) | |
[SPARK-13850] Force the sorter to Spill when number of elements in th…
## What changes were proposed in this pull request?
Force the sorter to spill when the number of elements in the pointer array reaches a certain size. This works around the issue of TimSort failing on large buffer sizes.
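The idea of a forced spill can be illustrated with a toy sketch. This is not Spark's `UnsafeExternalSorter`: the class `ToySorter`, its methods, and the use of an `ArrayList` as the in-memory buffer are all hypothetical, chosen only to show a sorter that flushes a sorted run once the number of buffered elements reaches a configured threshold, so that no single in-memory sort ever sees more than that many elements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ForceSpillSketch {

    // Hypothetical toy sorter; it only mimics the "spill at N elements" idea.
    static final class ToySorter {
        private final long numElementsForSpillThreshold;
        private final List<Long> inMemory = new ArrayList<>();
        // Stand-in for spill files on disk: each entry is one sorted run.
        private final List<List<Long>> spilledRuns = new ArrayList<>();

        ToySorter(long numElementsForSpillThreshold) {
            this.numElementsForSpillThreshold = numElementsForSpillThreshold;
        }

        void insert(long record) {
            // Force a spill before the buffer grows past the threshold,
            // regardless of how much memory is still available.
            if (inMemory.size() >= numElementsForSpillThreshold) {
                spill();
            }
            inMemory.add(record);
        }

        void spill() {
            if (inMemory.isEmpty()) {
                return;
            }
            List<Long> run = new ArrayList<>(inMemory);
            Collections.sort(run); // sorts at most `threshold` elements at a time
            spilledRuns.add(run);
            inMemory.clear();
        }

        int spillCount() {
            return spilledRuns.size();
        }

        int inMemoryCount() {
            return inMemory.size();
        }
    }

    public static void main(String[] args) {
        ToySorter sorter = new ToySorter(3);
        for (long i = 10; i > 0; i--) {
            sorter.insert(i);
        }
        System.out.println("spills=" + sorter.spillCount()
            + " buffered=" + sorter.inMemoryCount());
    }
}
```

In this sketch the threshold is a constructor argument; in the patch it is read from the `spark.shuffle.spill.numElementsForceSpillThreshold` setting, falling back to `UnsafeExternalSorter.DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD`.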
## How was this patch tested?
Tested by running a job that was failing without this change due to the TimSort bug.
Author: Sital Kedia <skedia@fb.com>
Closes #13107 from sitalkedia/fix_TimSort.
Diffstat (limited to 'sql/catalyst')
-rw-r--r-- | sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java b/sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java
index 0b177ad411..b4e87c3035 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java
@@ -89,6 +89,8 @@ public final class UnsafeExternalRowSorter {
       sparkEnv.conf().getInt("spark.shuffle.sort.initialBufferSize",
                              DEFAULT_INITIAL_SORT_BUFFER_SIZE),
       pageSizeBytes,
+      SparkEnv.get().conf().getLong("spark.shuffle.spill.numElementsForceSpillThreshold", UnsafeExternalSorter
+        .DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD),
       canUseRadixSort
     );
   }