Diffstat (limited to 'docs/configuration.md')
-rw-r--r-- | docs/configuration.md | 24 |
1 files changed, 22 insertions, 2 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index 6717757781..c1158491f0 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -104,14 +104,25 @@ Apart from these, the following properties are also available, and may be useful
 </tr>
 <tr>
 <td>spark.storage.memoryFraction</td>
- <td>0.66</td>
+ <td>0.6</td>
 <td>
 Fraction of Java heap to use for Spark's memory cache. This should not be larger than the "old"
- generation of objects in the JVM, which by default is given 2/3 of the heap, but you can increase
+ generation of objects in the JVM, which by default is given 0.6 of the heap, but you can increase
 it if you configure your own old generation size.
 </td>
 </tr>
 <tr>
+ <td>spark.shuffle.memoryFraction</td>
+ <td>0.3</td>
+ <td>
+ Fraction of Java heap to use for aggregation and cogroups during shuffles, if
+ <code>spark.shuffle.externalSorting</code> is enabled. At any given time, the collective size of
+ all in-memory maps used for shuffles is bounded by this limit, beyond which the contents will
+ begin to spill to disk. If spills are often, consider increasing this value at the expense of
+ <code>spark.storage.memoryFraction</code>.
+ </td>
+</tr>
+<tr>
 <td>spark.mesos.coarse</td>
 <td>false</td>
 <td>
@@ -377,6 +388,15 @@ Apart from these, the following properties are also available, and may be useful
 </td>
 </tr>
 <tr>
+ <td>spark.shuffle.externalSorting</td>
+ <td>true</td>
+ <td>
+ If set to "true", spills in-memory maps used for shuffles to disk when a memory threshold is reached. This
+ threshold is specified by <code>spark.shuffle.memoryFraction</code>. Enable this especially for memory-intensive
+ applications.
+ </td>
+</tr>
+<tr>
 <td>spark.speculation</td>
 <td>false</td>
 <td>
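The trade-off between the two fractions in this diff can be illustrated with some simple arithmetic. The sketch below is not Spark code: `memory_budgets` is a hypothetical helper, and the rule that the two fractions must not sum to more than 1.0 is an assumption drawn from the doc text's "at the expense of" wording rather than a check Spark is known to perform. The default values 0.6 and 0.3 are the ones introduced by the diff.

```python
def memory_budgets(heap_bytes, storage_fraction=0.6, shuffle_fraction=0.3):
    """Split a heap size into rough cache, shuffle, and leftover budgets.

    Hypothetical helper for illustration only; Spark's real accounting
    may apply further factors that are not modeled here.
    """
    for name, value in (("storage", storage_fraction), ("shuffle", shuffle_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} fraction must be between 0 and 1, got {value}")
    # Assumption: both pools are carved out of the same heap, so they
    # cannot together claim more than all of it.
    if storage_fraction + shuffle_fraction > 1.0:
        raise ValueError("storage and shuffle fractions together exceed the heap")

    storage = round(heap_bytes * storage_fraction)  # spark.storage.memoryFraction
    shuffle = round(heap_bytes * shuffle_fraction)  # spark.shuffle.memoryFraction
    return {
        "storage": storage,
        "shuffle": shuffle,
        "remaining": heap_bytes - storage - shuffle,
    }


if __name__ == "__main__":
    four_gib = 4 * 1024 ** 3
    # Defaults from the diff, then a shuffle-heavy split that trades
    # cache space for a larger spill threshold.
    print(memory_budgets(four_gib))
    print(memory_budgets(four_gib, storage_fraction=0.5, shuffle_fraction=0.4))
```

Under this model, raising the shuffle fraction to reduce spilling only works if the storage fraction is lowered by a matching amount, which is the adjustment the new `spark.shuffle.memoryFraction` description suggests.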