author CodingCat <zhunansjtu@gmail.com> 2015-02-23 11:29:25 +0000
committer Sean Owen <sowen@cloudera.com> 2015-02-23 11:29:25 +0000
commit 242d49584c6aa21d928db2552033661950f760a5 (patch)
tree f00b6160f3234934ffadec0946e139fe1f1b434b /docs/configuration.md
parent 757b14b862a1d39c1bad7b321dae1a3ea8338fbb (diff)
[SPARK-5724] fix the misconfiguration in AkkaUtils
https://issues.apache.org/jira/browse/SPARK-5724

In AkkaUtils, we set several failure-detector-related parameters as follows:

```
val akkaConf = ConfigFactory.parseMap(conf.getAkkaConf.toMap[String, String])
  .withFallback(akkaSslConfig).withFallback(ConfigFactory.parseString(
  s"""
  |akka.daemonic = on
  |akka.loggers = [""akka.event.slf4j.Slf4jLogger""]
  |akka.stdout-loglevel = "ERROR"
  |akka.jvm-exit-on-fatal-error = off
  |akka.remote.require-cookie = "$requireCookie"
  |akka.remote.secure-cookie = "$secureCookie"
  |akka.remote.transport-failure-detector.heartbeat-interval = $akkaHeartBeatInterval s
  |akka.remote.transport-failure-detector.acceptable-heartbeat-pause = $akkaHeartBeatPauses s
  |akka.remote.transport-failure-detector.threshold = $akkaFailureDetector
  |akka.actor.provider = "akka.remote.RemoteActorRefProvider"
  |akka.remote.netty.tcp.transport-class = "akka.remote.transport.netty.NettyTransport"
  |akka.remote.netty.tcp.hostname = "$host"
  |akka.remote.netty.tcp.port = $port
  |akka.remote.netty.tcp.tcp-nodelay = on
  |akka.remote.netty.tcp.connection-timeout = $akkaTimeout s
  |akka.remote.netty.tcp.maximum-frame-size = ${akkaFrameSize}B
  |akka.remote.netty.tcp.execution-pool-size = $akkaThreads
  |akka.actor.default-dispatcher.throughput = $akkaBatchSize
  |akka.log-config-on-start = $logAkkaConfig
  |akka.remote.log-remote-lifecycle-events = $lifecycleEvents
  |akka.log-dead-letters = $lifecycleEvents
  |akka.log-dead-letters-during-shutdown = $lifecycleEvents
  """.stripMargin))
```

Actually, there is no parameter named "akka.remote.transport-failure-detector.threshold" (see http://doc.akka.io/docs/akka/2.3.4/general/configuration.html); what exists is "akka.remote.watch-failure-detector.threshold".

Author: CodingCat <zhunansjtu@gmail.com>

Closes #4512 from CodingCat/SPARK-5724 and squashes the following commits:

bafe56e [CodingCat] fix the grammar in configuration doc
338296e [CodingCat] remove failure-detector related info
8bfcfd4 [CodingCat] fix the misconfiguration in AkkaUtils
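
For illustration only, the failure-detector part of the fragment above could be rendered without the non-existent threshold key roughly as follows. This is a minimal sketch, not the code from this commit: the object `FailureDetectorConf`, the method `render`, and its parameter names are hypothetical, and it assumes the fix simply drops the `transport-failure-detector.threshold` line while keeping the two heartbeat keys unchanged.

```scala
// Hypothetical helper, not part of Spark: renders only the two
// transport-failure-detector keys that appear in the snippet above,
// leaving out the `threshold` key that Akka 2.3 does not define there.
object FailureDetectorConf {
  def render(heartbeatIntervalSecs: Int, heartbeatPausesSecs: Int): String =
    s"""
       |akka.remote.transport-failure-detector.heartbeat-interval = $heartbeatIntervalSecs s
       |akka.remote.transport-failure-detector.acceptable-heartbeat-pause = $heartbeatPausesSecs s
       |""".stripMargin.trim

  def main(args: Array[String]): Unit = {
    // 1000 and 6000 are the defaults listed for spark.akka.heartbeat.interval
    // and spark.akka.heartbeat.pauses in docs/configuration.md.
    println(render(heartbeatIntervalSecs = 1000, heartbeatPausesSecs = 6000))
  }
}
```

The resulting string has the same `key = value s` shape that the snippet above passes to `ConfigFactory.parseString`, so it could be spliced into such a fallback block if the assumption holds.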
Diffstat (limited to 'docs/configuration.md')
-rw-r--r-- docs/configuration.md 36
1 files changed, 12 insertions, 24 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index 541695c83a..c8db338cb6 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -903,36 +903,24 @@ Apart from these, the following properties are also available, and may be useful
<td><code>spark.akka.heartbeat.pauses</code></td>
<td>6000</td>
<td>
- This is set to a larger value to disable failure detector that comes inbuilt akka. It can be
- enabled again, if you plan to use this feature (Not recommended). Acceptable heart beat pause
- in seconds for akka. This can be used to control sensitivity to gc pauses. Tune this in
- combination of `spark.akka.heartbeat.interval` and `spark.akka.failure-detector.threshold`
- if you need to.
- </td>
-</tr>
-<tr>
- <td><code>spark.akka.failure-detector.threshold</code></td>
- <td>300.0</td>
- <td>
- This is set to a larger value to disable failure detector that comes inbuilt akka. It can be
- enabled again, if you plan to use this feature (Not recommended). This maps to akka's
- `akka.remote.transport-failure-detector.threshold`. Tune this in combination of
- `spark.akka.heartbeat.pauses` and `spark.akka.heartbeat.interval` if you need to.
+ This is set to a larger value to disable the transport failure detector that comes built in to Akka.
+ It can be enabled again, if you plan to use this feature (Not recommended). Acceptable heart
+ beat pause in seconds for Akka. This can be used to control sensitivity to GC pauses. Tune
+ this along with `spark.akka.heartbeat.interval` if you need to.
</td>
</tr>
<tr>
<td><code>spark.akka.heartbeat.interval</code></td>
<td>1000</td>
<td>
- This is set to a larger value to disable failure detector that comes inbuilt akka. It can be
- enabled again, if you plan to use this feature (Not recommended). A larger interval value in
- seconds reduces network overhead and a smaller value ( ~ 1 s) might be more informative for
- akka's failure detector. Tune this in combination of `spark.akka.heartbeat.pauses` and
- `spark.akka.failure-detector.threshold` if you need to. Only positive use case for using
- failure detector can be, a sensistive failure detector can help evict rogue executors really
- quick. However this is usually not the case as gc pauses and network lags are expected in a
- real Spark cluster. Apart from that enabling this leads to a lot of exchanges of heart beats
- between nodes leading to flooding the network with those.
+ This is set to a larger value to disable the transport failure detector that comes built in to Akka.
+ It can be enabled again, if you plan to use this feature (Not recommended). A larger interval
+ value in seconds reduces network overhead and a smaller value ( ~ 1 s) might be more informative
+ for Akka's failure detector. Tune this in combination of `spark.akka.heartbeat.pauses` if you need
+ to. A likely positive use case for using failure detector would be: a sensistive failure detector
+ can help evict rogue executors quickly. However this is usually not the case as GC pauses
+ and network lags are expected in a real Spark cluster. Apart from that enabling this leads to
+ a lot of exchanges of heart beats between nodes leading to flooding the network with those.
</td>
</tr>
<tr>