Diffstat (limited to 'docs/configuration.md')
-rw-r--r-- | docs/configuration.md | 18 |
1 files changed, 18 insertions, 0 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index 5e3eb0f087..4d27c5a918 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -281,6 +281,24 @@ Apart from these, the following properties are also available, and may be useful
     overhead per reduce task, so keep it small unless you have a large amount of memory.
   </td>
 </tr>
+<tr>
+  <td><code>spark.shuffle.manager</code></td>
+  <td>HASH</td>
+  <td>
+    Implementation to use for shuffling data. A hash-based shuffle manager is the default, but
+    starting in Spark 1.1 there is an experimental sort-based shuffle manager that is more
+    memory-efficient in environments with small executors, such as YARN. To use that, change
+    this value to <code>SORT</code>.
+  </td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.sort.bypassMergeThreshold</code></td>
+  <td>200</td>
+  <td>
+    (Advanced) In the sort-based shuffle manager, avoid merge-sorting data if there is no
+    map-side aggregation and there are at most this many reduce partitions.
+  </td>
+</tr>
 </table>

 #### Spark UI
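For reference, the two properties documented by this patch could be set in `conf/spark-defaults.conf` roughly as shown here. This is a minimal sketch, assuming a Spark build that includes the sort-based shuffle manager (1.1 or later per the text above). The values are taken from the documentation in the diff, and the threshold shown is simply the documented default, not a tuning recommendation.

```properties
# Assumption: placed in conf/spark-defaults.conf; one "key value" pair per line.

# Switch from the default hash-based shuffle to the experimental sort-based one.
spark.shuffle.manager                     SORT

# (Advanced) Skip merge-sorting when there is no map-side aggregation and
# at most this many reduce partitions. 200 is the documented default.
spark.shuffle.sort.bypassMergeThreshold   200
```

The second property only matters when the first is set to `SORT`; with the default hash-based manager it should have no effect.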