path: root/docs/configuration.md
author	Matei Zaharia <matei@databricks.com>	2014-08-07 18:04:49 -0700
committer	Reynold Xin <rxin@apache.org>	2014-08-07 18:04:49 -0700
commit	6906b69cf568015f20c7d7c77cbcba650e5431a9 (patch)
tree	fbec9283bed82e0b2f0baa8c25834f3a54fac5ba /docs/configuration.md
parent	32096c2aed9978cfb9a904b4f56bb61800d17e9e (diff)
SPARK-2787: Make sort-based shuffle write files directly when there's no sorting/aggregation and # partitions is small
As described in https://issues.apache.org/jira/browse/SPARK-2787, right now sort-based shuffle is more expensive than hash-based for map operations that do no partial aggregation or sorting, such as groupByKey. This is because it has to serialize each data item twice (once when spilling to intermediate files, and then again when merging these files object-by-object). This patch adds a code path to just write separate files directly if the number of output partitions is small, and concatenate them at the end to produce a sorted file.

On the unit test side, I added some tests that force or don't force this bypass path to be used, and checked that our tests for other features (e.g. all the operations) cover both cases.

Author: Matei Zaharia <matei@databricks.com>

Closes #1799 from mateiz/SPARK-2787 and squashes the following commits:

88cf26a [Matei Zaharia] Fix rebase
10233af [Matei Zaharia] Review comments
398cb95 [Matei Zaharia] Fix looking up shuffle manager in conf
ca3efd9 [Matei Zaharia] Add docs for shuffle manager properties, and allow short names for them
d0ae3c5 [Matei Zaharia] Fix some comments
90d084f [Matei Zaharia] Add code path to bypass merge-sort in ExternalSorter, and tests
31e5d7c [Matei Zaharia] Move existing logic for writing partitioned files into ExternalSorter
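For context, a minimal sketch of how the two properties documented by this patch might be set from application code. The property names and defaults come from the diff below; the SparkConf-based setup around them is only an illustrative assumption, not part of the patch:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative setup only: the two shuffle properties are the ones documented
// in this patch; the surrounding application configuration is hypothetical.
val conf = new SparkConf()
  .setAppName("shuffle-config-example")
  // Opt in to the experimental sort-based shuffle manager (default is HASH).
  .set("spark.shuffle.manager", "SORT")
  // With no map-side aggregation and at most this many reduce partitions,
  // the sort-based path writes per-partition files directly instead of
  // merge-sorting spilled data.
  .set("spark.shuffle.sort.bypassMergeThreshold", "200")

val sc = new SparkContext(conf)
```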
Diffstat (limited to 'docs/configuration.md')
-rw-r--r--	docs/configuration.md	| 18
1 file changed, 18 insertions(+), 0 deletions(-)
diff --git a/docs/configuration.md b/docs/configuration.md
index 5e3eb0f087..4d27c5a918 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -281,6 +281,24 @@ Apart from these, the following properties are also available, and may be useful
overhead per reduce task, so keep it small unless you have a large amount of memory.
</td>
</tr>
+<tr>
+ <td><code>spark.shuffle.manager</code></td>
+ <td>HASH</td>
+ <td>
+ Implementation to use for shuffling data. A hash-based shuffle manager is the default, but
+ starting in Spark 1.1 there is an experimental sort-based shuffle manager that is more
+ memory-efficient in environments with small executors, such as YARN. To use that, change
+ this value to <code>SORT</code>.
+ </td>
+</tr>
+<tr>
+ <td><code>spark.shuffle.sort.bypassMergeThreshold</code></td>
+ <td>200</td>
+ <td>
+ (Advanced) In the sort-based shuffle manager, avoid merge-sorting data if there is no
+ map-side aggregation and there are at most this many reduce partitions.
+ </td>
+</tr>
</table>
#### Spark UI
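As a follow-up illustration (not part of the patch), a hypothetical workload where the new bypass path would apply: groupByKey performs no map-side aggregation, so with a reduce-partition count at or below spark.shuffle.sort.bypassMergeThreshold, the ExternalSorter can write per-partition files directly and concatenate them. The job below is an assumed example built on the `sc` from the earlier sketch:

```scala
// Needed in Spark 1.1 for the implicit conversion to PairRDDFunctions.
import org.apache.spark.SparkContext._

// Hypothetical workload: groupByKey does no partial (map-side) aggregation,
// and 100 reduce partitions is below the default bypassMergeThreshold of 200,
// so under the SORT shuffle manager this shuffle can skip merge-sorting and
// write per-partition files directly.
val pairs = sc.parallelize(1 to 1000000).map(i => (i % 100, i))
val grouped = pairs.groupByKey(100)
grouped.count()
```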