Merge remote-tracking branch 'apache/master' into error-handling

author: Tathagata Das <tathagata.das1565@gmail.com> 2014-01-11 23:40:57 -0800
committer: Tathagata Das <tathagata.das1565@gmail.com> 2014-01-11 23:40:57 -0800
commit: 18f4889d96b61b59569ec05f64900da1477404d0 (patch)
tree: 3dcb95406babf79d7234aa3851366d5b162dde72 /docs/configuration.md
parent: 4d9b0ab420df383869fa586b229ac00f234b8749 (diff)
parent: 288a878999848adb130041d1e40c14bfc879cec6 (diff)
download: spark-18f4889d96b61b59569ec05f64900da1477404d0.tar.gz
spark-18f4889d96b61b59569ec05f64900da1477404d0.tar.bz2
spark-18f4889d96b61b59569ec05f64900da1477404d0.zip
1 files changed, 21 insertions, 2 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index b1a0e19167..ad75e06fc7 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -104,14 +104,25 @@ Apart from these, the following properties are also available, and may be useful
 </tr>
 <tr>
   <td>spark.storage.memoryFraction</td>
-  <td>0.66</td>
+  <td>0.6</td>
   <td>
     Fraction of Java heap to use for Spark's memory cache. This should not be larger than the "old"
-    generation of objects in the JVM, which by default is given 2/3 of the heap, but you can increase
+    generation of objects in the JVM, which by default is given 0.6 of the heap, but you can increase
     it if you configure your own old generation size.
   </td>
 </tr>
 <tr>
+  <td>spark.shuffle.memoryFraction</td>
+  <td>0.3</td>
+  <td>
+    Fraction of Java heap to use for aggregation and cogroups during shuffles, if
+    <code>spark.shuffle.externalSorting</code> is enabled. At any given time, the collective size of
+    all in-memory maps used for shuffles is bounded by this limit, beyond which the contents will
+    begin to spill to disk. If spills occur often, consider increasing this value at the expense of
+    <code>spark.storage.memoryFraction</code>.
+  </td>
+</tr>
+<tr>
   <td>spark.mesos.coarse</td>
   <td>false</td>
   <td>
@@ -377,6 +388,14 @@ Apart from these, the following properties are also available, and may be useful
   </td>
 </tr>
 <tr>
+  <td>spark.shuffle.externalSorting</td>
+  <td>true</td>
+  <td>
+    If set to "true", limits the amount of memory used during reduces by spilling data out to disk. This spilling
+    threshold is specified by <code>spark.shuffle.memoryFraction</code>.
+  </td>
+</tr>
+<tr>
   <td>spark.speculation</td>
   <td>false</td>
   <td>
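
To make the trade-off between the two fractions in this patch concrete, here is a minimal arithmetic sketch. It is not Spark code: the function name `memory_budgets` and its return keys are hypothetical, and it assumes each property is applied as a plain fraction of the total heap, ignoring any additional safety margins Spark may apply internally.

```python
def memory_budgets(heap_bytes, storage_fraction=0.6, shuffle_fraction=0.3):
    """Split a heap size into storage, shuffle, and remaining budgets.

    Defaults mirror the values documented in the patch for
    spark.storage.memoryFraction and spark.shuffle.memoryFraction.
    This is an illustrative helper, not part of Spark.
    """
    for name, value in (("storage", storage_fraction), ("shuffle", shuffle_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} fraction must be between 0 and 1")
    if storage_fraction + shuffle_fraction > 1.0:
        raise ValueError("fractions together exceed the heap")

    storage = round(heap_bytes * storage_fraction)
    shuffle = round(heap_bytes * shuffle_fraction)
    return {
        "storage": storage,   # cache budget
        "shuffle": shuffle,   # in-memory shuffle maps before spilling
        "other": heap_bytes - storage - shuffle,
    }


if __name__ == "__main__":
    # Documented defaults, then a variant that gives shuffles more room
    # at the expense of the cache, as the patch text suggests.
    print(memory_budgets(1000))
    print(memory_budgets(1000, storage_fraction=0.5, shuffle_fraction=0.4))
```

Under these assumptions, raising the shuffle fraction by some amount and lowering the storage fraction by the same amount leaves the remainder of the heap unchanged, which is the adjustment the new `spark.shuffle.memoryFraction` description recommends when spills are frequent.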