author: Reynold Xin <rxin@apache.org> 2014-09-07 20:42:07 -0700
committer: Reynold Xin <rxin@apache.org> 2014-09-07 20:42:07 -0700
commit: f25bbbdb3ac5620850c7d09d6a63af888411ecf1
tree: c0af880726ce3e961dd59c8f76e36ee364569460
parent: 4ba2673569f8c6da7f7348977f52f98f40dfbfec
[SPARK-3280] Made sort-based shuffle the default implementation

Sort-based shuffle has lower memory usage and seems to outperform hash-based in almost all of our testing.

Author: Reynold Xin <rxin@apache.org>

Closes #2178 from rxin/sort-shuffle and squashes the following commits:

713d341 [Reynold Xin] Fixed test failures by setting spark.shuffle.compress to the same value as spark.shuffle.spill.compress.
85165e6 [Reynold Xin] Fixed a comment typo.
aa0d372 [Reynold Xin] [SPARK-3280] Made sort-based shuffle the default implementation

Diffstat (limited to 'docs/configuration.md')
-rw-r--r-- docs/configuration.md | 9
1 files changed, 4 insertions, 5 deletions
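
The documentation change in this commit names two accepted values for `spark.shuffle.manager`: `sort` and `hash`. As a minimal sketch of opting back into the previous behavior, assuming the property is set through a `conf/spark-defaults.conf` file (the file location is an assumption about the deployment, not something this commit specifies):

```
# conf/spark-defaults.conf (assumed location)
# Select the hash-based shuffle manager explicitly.
# Omitting this line leaves the new default, sort, in effect.
spark.shuffle.manager    hash
```

Leaving the property unset is probably the better starting point, since the commit message reports lower memory usage and better performance for the sort-based implementation in almost all of the author's testing.
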
diff --git a/docs/configuration.md b/docs/configuration.md
index 65a422caab..36178efb97 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -293,12 +293,11 @@ Apart from these, the following properties are also available, and may be useful
</tr>
<tr>
<td><code>spark.shuffle.manager</code></td>
- <td>HASH</td>
+ <td>sort</td>
<td>
- Implementation to use for shuffling data. A hash-based shuffle manager is the default, but
- starting in Spark 1.1 there is an experimental sort-based shuffle manager that is more
- memory-efficient in environments with small executors, such as YARN. To use that, change
- this value to <code>SORT</code>.
+ Implementation to use for shuffling data. There are two implementations available:
+ <code>sort</code> and <code>hash</code>. Sort-based shuffle is more memory-efficient and is
+ the default option starting in 1.2.
</td>
</tr>
<tr>