Diffstat (limited to 'docs/configuration.md')
-rw-r--r-- | docs/configuration.md | 37 |
1 files changed, 32 insertions, 5 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index d8317ea97c..a7054b4321 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -198,27 +198,54 @@ Apart from these, the following properties are also available, and may be useful
   </td>
 </tr>
 <tr>
+  <td>spark.akka.frameSize</td>
+  <td>10</td>
+  <td>
+    Maximum message size to allow in "control plane" communication (for serialized tasks and task
+    results), in MB. Increase this if your tasks need to send back large results to the driver
+    (e.g. using <code>collect()</code> on a large dataset).
+  </td>
+</tr>
+<tr>
   <td>spark.akka.threads</td>
   <td>4</td>
   <td>
     Number of actor threads to use for communication. Can be useful to increase on large clusters
-    when the master has a lot of CPU cores.
+    when the driver has a lot of CPU cores.
   </td>
 </tr>
 <tr>
-  <td>spark.master.host</td>
+  <td>spark.akka.timeout</td>
+  <td>20</td>
+  <td>
+    Communication timeout between Spark nodes.
+  </td>
+</tr>
+<tr>
+  <td>spark.driver.host</td>
   <td>(local hostname)</td>
   <td>
-    Hostname or IP address for the master to listen on.
+    Hostname or IP address for the driver to listen on.
   </td>
 </tr>
 <tr>
-  <td>spark.master.port</td>
+  <td>spark.driver.port</td>
   <td>(random)</td>
   <td>
-    Port for the master to listen on.
+    Port for the driver to listen on.
+  </td>
+</tr>
+<tr>
+  <td>spark.cleaner.delay</td>
+  <td>(disable)</td>
+  <td>
+    Duration (minutes) of how long Spark will remember any metadata (stages generated, tasks generated, etc.).
+    Periodic cleanups will ensure that metadata older than this duration will be forgetten. This is
+    useful for running Spark for many hours / days (for example, running 24/7 in case of Spark Streaming
+    applications). Note that any RDD that persists in memory for more than this duration will be cleared as well.
   </td>
 </tr>
+
 </table>
 
 # Configuring Logging
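
Note on the renamed keys: the diff documents the properties but does not show how they are set. The sketch below assumes they are read from JVM system properties before the Spark context is created, which the diff itself does not confirm. The class `DriverPropsSketch`, its methods, and the values (50 MB, `127.0.0.1`, port 7078) are hypothetical; only the property keys come from the diff.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DriverPropsSketch {

    // Hypothetical helper: collects the settings documented in the diff.
    // The keys are taken from the diff; the values are illustrative only.
    static Map<String, String> buildProps(int frameSizeMb, String driverHost, int driverPort) {
        if (frameSizeMb <= 0) {
            throw new IllegalArgumentException("frameSizeMb must be positive");
        }
        Map<String, String> props = new LinkedHashMap<>();
        props.put("spark.akka.frameSize", Integer.toString(frameSizeMb)); // in MB
        props.put("spark.driver.host", driverHost);   // formerly spark.master.host
        props.put("spark.driver.port", Integer.toString(driverPort)); // formerly spark.master.port
        return props;
    }

    // Assumption: the settings are picked up from JVM system properties,
    // so this would have to run before any Spark context is constructed.
    static void apply(Map<String, String> props) {
        for (Map.Entry<String, String> e : props.entrySet()) {
            System.setProperty(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        apply(buildProps(50, "127.0.0.1", 7078));
        System.out.println("spark.akka.frameSize = " + System.getProperty("spark.akka.frameSize"));
        System.out.println("spark.driver.host = " + System.getProperty("spark.driver.host"));
        System.out.println("spark.driver.port = " + System.getProperty("spark.driver.port"));
    }
}
```

Code that still sets `spark.master.host` or `spark.master.port` would need to switch to the `spark.driver.*` keys after this change, since the diff replaces the old names rather than aliasing them.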