diff options
Diffstat (limited to 'docs')
-rw-r--r-- | docs/configuration.md   | 10
-rw-r--r-- | docs/job-scheduling.md  |  5
-rw-r--r-- | docs/running-on-yarn.md | 27
3 files changed, 36 insertions, 6 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index 12ac601296..acaeb83008 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -173,7 +173,7 @@ of the most common options to set are:
     stored on disk. This should be on a fast, local disk in your system. It can also be a
     comma-separated list of multiple directories on different disks.
 
-    NOTE: In Spark 1.0 and later this will be overriden by SPARK_LOCAL_DIRS (Standalone, Mesos) or
+    NOTE: In Spark 1.0 and later this will be overridden by SPARK_LOCAL_DIRS (Standalone, Mesos) or
     LOCAL_DIRS (YARN) environment variables set by the cluster manager.
   </td>
 </tr>
@@ -687,10 +687,10 @@ Apart from these, the following properties are also available, and may be useful
   <td><code>spark.rdd.compress</code></td>
   <td>false</td>
   <td>
-    Whether to compress serialized RDD partitions (e.g. for
-    <code>StorageLevel.MEMORY_ONLY_SER</code> in Java
-    and Scala or <code>StorageLevel.MEMORY_ONLY</code> in Python).
-    Can save substantial space at the cost of some extra CPU time.
+    Whether to compress serialized RDD partitions (e.g. for
+    <code>StorageLevel.MEMORY_ONLY_SER</code> in Java
+    and Scala or <code>StorageLevel.MEMORY_ONLY</code> in Python).
+    Can save substantial space at the cost of some extra CPU time.
   </td>
 </tr>
 <tr>
diff --git a/docs/job-scheduling.md b/docs/job-scheduling.md
index 6c587b3f0d..95d47794ea 100644
--- a/docs/job-scheduling.md
+++ b/docs/job-scheduling.md
@@ -39,7 +39,10 @@ Resource allocation can be configured as follows, based on the cluster type:
   and optionally set `spark.cores.max` to limit each application's resource share as in the standalone mode.
   You should also set `spark.executor.memory` to control the executor memory.
 * **YARN:** The `--num-executors` option to the Spark YARN client controls how many executors it will allocate
-  on the cluster, while `--executor-memory` and `--executor-cores` control the resources per executor.
+  on the cluster (`spark.executor.instances` as configuration property), while `--executor-memory`
+  (`spark.executor.memory` configuration property) and `--executor-cores` (`spark.executor.cores` configuration
+  property) control the resources per executor. For more information, see the
+  [YARN Spark Properties](running-on-yarn.html).
 
 A second option available on Mesos is _dynamic sharing_ of CPU cores. In this mode, each Spark application
 still has a fixed and independent memory allocation (set by `spark.executor.memory`), but when the
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index a148c867eb..ad66b9f64a 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -114,6 +114,19 @@ If you need a reference to the proper location to put log files in the YARN so t
   </td>
 </tr>
 <tr>
+  <td><code>spark.driver.memory</code></td>
+  <td>1g</td>
+  <td>
+    Amount of memory to use for the driver process, i.e. where SparkContext is initialized.
+    (e.g. <code>1g</code>, <code>2g</code>).
+
+    <br /><em>Note:</em> In client mode, this config must not be set through the <code>SparkConf</code>
+    directly in your application, because the driver JVM has already started at that point.
+    Instead, please set this through the <code>--driver-memory</code> command line option
+    or in your default properties file.
+  </td>
+</tr>
+<tr>
   <td><code>spark.driver.cores</code></td>
   <td><code>1</code></td>
   <td>
@@ -203,6 +216,13 @@ If you need a reference to the proper location to put log files in the YARN so t
   </td>
 </tr>
 <tr>
+  <td><code>spark.executor.cores</code></td>
+  <td>1 in YARN mode, all the available cores on the worker in standalone mode.</td>
+  <td>
+    The number of cores to use on each executor. For YARN and standalone mode only.
+  </td>
+</tr>
+<tr>
   <td><code>spark.executor.instances</code></td>
   <td><code>2</code></td>
   <td>
@@ -210,6 +230,13 @@ If you need a reference to the proper location to put log files in the YARN so t
   </td>
 </tr>
 <tr>
+  <td><code>spark.executor.memory</code></td>
+  <td>1g</td>
+  <td>
+    Amount of memory to use per executor process (e.g. <code>2g</code>, <code>8g</code>).
+  </td>
+</tr>
+<tr>
   <td><code>spark.yarn.executor.memoryOverhead</code></td>
   <td>executorMemory * 0.10, with minimum of 384 </td>
   <td>
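
The flag/property pairs this commit documents for YARN can be illustrated with a `spark-submit` invocation. This is a sketch, not part of the commit: the application jar name and all sizing values are hypothetical, and the flag-to-property mapping is the one stated in the diff above.

```shell
# Flags and the configuration properties they correspond to on YARN
# (per the job-scheduling.md change above; values are illustrative):
#   --num-executors    -> spark.executor.instances
#   --executor-memory  -> spark.executor.memory
#   --executor-cores   -> spark.executor.cores
spark-submit \
  --master yarn \
  --num-executors 4 \
  --executor-memory 2g \
  --executor-cores 2 \
  my-app.jar

# The same sizing expressed as configuration properties via --conf:
spark-submit \
  --master yarn \
  --conf spark.executor.instances=4 \
  --conf spark.executor.memory=2g \
  --conf spark.executor.cores=2 \
  my-app.jar
```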
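
The client-mode note added for `spark.driver.memory` means the value must reach the launcher before the driver JVM starts. A sketch of the two routes the new doc text names (sizes illustrative, jar name hypothetical):

```shell
# Route 1: the command-line option at submit time.
spark-submit --master yarn --driver-memory 2g my-app.jar

# Route 2: the default properties file, conventionally conf/spark-defaults.conf,
# containing a line such as:
#   spark.driver.memory  2g
#
# By contrast, SparkConf.set("spark.driver.memory", "2g") inside the application
# has no effect in client mode: the driver JVM is already running by then.
```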