author | Aaron Davidson <aaron@databricks.com> | 2014-12-22 13:09:22 -0800 |
---|---|---|
committer | Patrick Wendell <pwendell@gmail.com> | 2014-12-22 13:09:31 -0800 |
commit | 4b2bdedface53263d004b5c0306f2f2483a9c4bb (patch) | |
tree | 713801d6e16f1040c8da82d188550917f214074e /docs/configuration.md | |
parent | c7396b5887afe1bbe344ffcf06ef266847c378ac (diff) | |
[SPARK-4864] Add documentation to Netty-based configs
Author: Aaron Davidson <aaron@databricks.com>
Closes #3713 from aarondav/netty-configs and squashes the following commits:
8a8b373 [Aaron Davidson] Address Patrick's comments
3b1f84e [Aaron Davidson] [SPARK-4864] Add documentation to Netty-based configs
(cherry picked from commit fbca6b6ce293b1997b40abeb9ab77b8a969a5fc9)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
Diffstat (limited to 'docs/configuration.md')
-rw-r--r-- | docs/configuration.md | 35 |
1 files changed, 35 insertions, 0 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index ff30eac4d9..60fde1386a 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -852,6 +852,41 @@ Apart from these, the following properties are also available, and may be useful
     between nodes leading to flooding the network with those.
   </td>
 </tr>
+<tr>
+  <td><code>spark.shuffle.io.preferDirectBufs</code></td>
+  <td>true</td>
+  <td>
+    (Netty only) Off-heap buffers are used to reduce garbage collection during shuffle and cache
+    block transfer. For environments where off-heap memory is tightly limited, users may wish to
+    turn this off to force all allocations from Netty to be on-heap.
+  </td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.io.numConnectionsPerPeer</code></td>
+  <td>1</td>
+  <td>
+    (Netty only) Connections between hosts are reused in order to reduce connection buildup for
+    large clusters. For clusters with many hard disks and few hosts, this may result in insufficient
+    concurrency to saturate all disks, and so users may consider increasing this value.
+  </td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.io.maxRetries</code></td>
+  <td>3</td>
+  <td>
+    (Netty only) Fetches that fail due to IO-related exceptions are automatically retried if this is
+    set to a non-zero value. This retry logic helps stabilize large shuffles in the face of long GC
+    pauses or transient network connectivity issues.
+  </td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.io.retryWait</code></td>
+  <td>5</td>
+  <td>
+    (Netty only) Seconds to wait between retries of fetches. The maximum delay caused by retrying
+    is simply <code>maxRetries * retryWait</code>, by default 15 seconds.
+  </td>
+</tr>
 </table>
 
 #### Scheduling
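
As a rough illustration of the four properties this commit documents, the sketch below records their documented defaults, computes the `maxRetries * retryWait` bound from the `retryWait` description, and renders overrides as `--conf key=value` arguments for `spark-submit`. It is not part of the commit. The helper names (`max_retry_delay_seconds`, `to_submit_args`) and the override values are hypothetical, chosen only for the example.

```python
# Defaults exactly as listed in the documentation table added by this commit.
NETTY_SHUFFLE_DEFAULTS = {
    "spark.shuffle.io.preferDirectBufs": "true",
    "spark.shuffle.io.numConnectionsPerPeer": "1",
    "spark.shuffle.io.maxRetries": "3",
    "spark.shuffle.io.retryWait": "5",  # seconds between fetch retries
}


def max_retry_delay_seconds(conf):
    """Upper bound on delay added by retrying: maxRetries * retryWait."""
    retries = int(conf["spark.shuffle.io.maxRetries"])
    wait = int(conf["spark.shuffle.io.retryWait"])
    return retries * wait


def to_submit_args(conf):
    """Render settings as `--conf key=value` pairs (hypothetical helper)."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


# Illustrative overrides (assumed values, not recommendations): a cluster with
# few hosts and many disks, where off-heap memory is tightly limited.
tuned = dict(NETTY_SHUFFLE_DEFAULTS)
tuned["spark.shuffle.io.numConnectionsPerPeer"] = "4"
tuned["spark.shuffle.io.preferDirectBufs"] = "false"

print(max_retry_delay_seconds(NETTY_SHUFFLE_DEFAULTS))  # 15, matching the docs
print(" ".join(to_submit_args(tuned)))
```

Setting `spark.shuffle.io.maxRetries` to `0` makes the computed bound zero, which matches the documented behavior that retries happen only for a non-zero value.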