From f25bbbdb3ac5620850c7d09d6a63af888411ecf1 Mon Sep 17 00:00:00 2001
From: Reynold Xin <rxin@apache.org>
Date: Sun, 7 Sep 2014 20:42:07 -0700
Subject: [SPARK-3280] Made sort-based shuffle the default implementation

Sort-based shuffle has lower memory usage and seems to outperform hash-based shuffle in almost all of our testing.

Author: Reynold Xin <rxin@apache.org>

Closes #2178 from rxin/sort-shuffle and squashes the following commits:

713d341 [Reynold Xin] Fixed test failures by setting spark.shuffle.compress to the same value as spark.shuffle.spill.compress.
85165e6 [Reynold Xin] Fixed a comment typo.
aa0d372 [Reynold Xin] [SPARK-3280] Made sort-based shuffle the default implementation
---
 docs/configuration.md | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)
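
The updated entry lists two accepted values, so the previous behavior remains
selectable. A minimal sketch, assuming the property is set in
conf/spark-defaults.conf (the file choice is an assumption for illustration and
is not part of this patch; only the values sort and hash come from the new text):

```properties
# Assumed location: conf/spark-defaults.conf
# Opt back into the hash-based shuffle manager. Omit this line to keep the
# sort-based default described in the patch.
spark.shuffle.manager    hash
```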

diff --git a/docs/configuration.md b/docs/configuration.md
index 65a422caab..36178efb97 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -293,12 +293,11 @@ Apart from these, the following properties are also available, and may be useful
 </tr>
 <tr>
   <td><code>spark.shuffle.manager</code></td>
-  <td>HASH</td>
+  <td>sort</td>
   <td>
-    Implementation to use for shuffling data. A hash-based shuffle manager is the default, but
-    starting in Spark 1.1 there is an experimental sort-based shuffle manager that is more 
-    memory-efficient in environments with small executors, such as YARN. To use that, change
-    this value to <code>SORT</code>.
+    Implementation to use for shuffling data. There are two implementations available:
+    <code>sort</code> and <code>hash</code>. Sort-based shuffle is more memory-efficient and is
+    the default option starting in 1.2.
   </td>
 </tr>
 <tr>
-- 
cgit v1.2.3