Modified shuffle to limit the maximum outstanding data size in bytes, instead of the maximum number of outstanding fetches.

This should make shuffle faster when there are many small map output files, and more robust against overallocating memory when map outputs are large.
author: Matei Zaharia <matei@eecs.berkeley.edu> 2012-10-06 20:07:10 -0700
committer: Matei Zaharia <matei@eecs.berkeley.edu> 2012-10-06 20:07:10 -0700
commit: dc28a3ac0a052f7327d03de76c3b153cda2b616a (patch)
tree: 953ac4550c3e49b6e75772da76186d714b2caeaa /docs
parent: 9a3b3f32a3ccb849293180a899377e8468f7544a (diff)
download: spark-dc28a3ac0a052f7327d03de76c3b153cda2b616a.tar.gz
spark-dc28a3ac0a052f7327d03de76c3b153cda2b616a.tar.bz2
spark-dc28a3ac0a052f7327d03de76c3b153cda2b616a.zip
1 files changed, 5 insertions, 3 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index fa7123af1b..0987f7f7b1 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -139,10 +139,12 @@ Apart from these, the following properties are also available, and may be useful
   </td>
 </tr>
 <tr>
-  <td>spark.blockManager.parallelFetches</td>
-  <td>4</td>
+  <td>spark.reducer.maxMbInFlight</td>
+  <td>48</td>
   <td>
-    Number of map output files to fetch concurrently from each reduce task.
+    Maximum size (in megabytes) of map outputs to fetch simultaneously from each reduce task. Since
+    each output requires us to create a buffer to receive it, this represents a fixed memory overhead
+    per reduce task, so keep it small unless you have a large amount of memory.
   </td>
 </tr>
 <tr>
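
To illustrate the difference between capping the fetch count and capping the bytes in flight, here is a minimal sketch of a byte-budgeted fetch plan. It is not Spark's shuffle code: the helper `plan_fetch_waves` and its wave-by-wave model are assumptions made for illustration, and a real fetcher would likely issue new requests as earlier ones complete rather than in discrete waves. The 48 MB figure is the default from the diff above.

```python
from collections import deque

MB = 1024 * 1024


def plan_fetch_waves(block_sizes, max_bytes_in_flight):
    """Group blocks into successive waves whose total size stays within the cap.

    Hypothetical helper for illustration only; not Spark's scheduling code.
    A block larger than the cap is still fetched, alone in its own wave,
    so the reducer always makes progress.
    """
    pending = deque(block_sizes)
    waves = []
    while pending:
        wave, in_flight = [], 0
        # Always admit the first block of a wave; admit later blocks only
        # while the running total stays within the byte budget.
        while pending and (not wave or in_flight + pending[0] <= max_bytes_in_flight):
            size = pending.popleft()
            wave.append(size)
            in_flight += size
        waves.append(wave)
    return waves


# Many small map outputs: the byte cap lets 48 of them go out together,
# where a fixed count of 4 concurrent fetches would need 25 rounds.
small = plan_fetch_waves([1 * MB] * 100, 48 * MB)

# Large map outputs: the byte cap keeps them from being buffered together.
large = plan_fetch_waves([40 * MB, 40 * MB, 100 * MB, 1 * MB], 48 * MB)
```

In this model the buffer memory held by a reduce task is bounded by the cap, except when a single block exceeds it on its own, which is why the documentation text advises keeping the value small on memory-constrained machines.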