author     felixcheung <felixcheung_m@hotmail.com>   2016-01-21 16:30:20 +0100
committer  Sean Owen <sowen@cloudera.com>            2016-01-21 16:30:20 +0100
commit     85200c09adc6eb98fadb8505f55cb44e3d8b3390 (patch)
tree       21321d39a9962c0c7525165773ef64fd98cbe8bf /docs
parent     1b2a918e59addcdccdf8e011bce075cc9dd07b93 (diff)
[SPARK-12534][DOC] update documentation to list command line equivalent to properties
Several Spark properties equivalent to Spark submit command line options are missing.

Author: felixcheung <felixcheung_m@hotmail.com>

Closes #10491 from felixcheung/sparksubmitdoc.
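For illustration (values below are hypothetical), the executor resources discussed in this change can be requested either through spark-submit command line options or through their property equivalents, for example set programmatically in Scala:

```scala
// Command line form (hypothetical invocation):
//   spark-submit --num-executors 4 --executor-memory 2g --executor-cores 2 app.jar
// Equivalent configuration properties, set in application code:
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("PropertyEquivalents")       // illustrative app name
  .set("spark.executor.instances", "4")    // --num-executors 4
  .set("spark.executor.memory", "2g")      // --executor-memory 2g
  .set("spark.executor.cores", "2")        // --executor-cores 2
// Master and deploy mode are supplied by spark-submit.
val sc = new SparkContext(conf)
```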
Diffstat (limited to 'docs')
-rw-r--r--  docs/configuration.md    10
-rw-r--r--  docs/job-scheduling.md    5
-rw-r--r--  docs/running-on-yarn.md  27
3 files changed, 36 insertions(+), 6 deletions(-)
diff --git a/docs/configuration.md b/docs/configuration.md
index 12ac601296..acaeb83008 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -173,7 +173,7 @@ of the most common options to set are:
stored on disk. This should be on a fast, local disk in your system. It can also be a
comma-separated list of multiple directories on different disks.
- NOTE: In Spark 1.0 and later this will be overriden by SPARK_LOCAL_DIRS (Standalone, Mesos) or
+ NOTE: In Spark 1.0 and later this will be overridden by SPARK_LOCAL_DIRS (Standalone, Mesos) or
LOCAL_DIRS (YARN) environment variables set by the cluster manager.
</td>
</tr>
@@ -687,10 +687,10 @@ Apart from these, the following properties are also available, and may be useful
<td><code>spark.rdd.compress</code></td>
<td>false</td>
<td>
- Whether to compress serialized RDD partitions (e.g. for
- <code>StorageLevel.MEMORY_ONLY_SER</code> in Java
- and Scala or <code>StorageLevel.MEMORY_ONLY</code> in Python).
- Can save substantial space at the cost of some extra CPU time.
+ Whether to compress serialized RDD partitions (e.g. for
+ <code>StorageLevel.MEMORY_ONLY_SER</code> in Java
+ and Scala or <code>StorageLevel.MEMORY_ONLY</code> in Python).
+ Can save substantial space at the cost of some extra CPU time.
</td>
</tr>
<tr>
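A minimal sketch of the spark.rdd.compress setting described in the hunk above, assuming a local Scala application with an illustrative dataset:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val conf = new SparkConf()
  .setMaster("local[2]")                  // illustrative local run
  .setAppName("RddCompressExample")
  .set("spark.rdd.compress", "true")      // compress serialized RDD partitions
val sc = new SparkContext(conf)

// Compression only applies to serialized partitions, so persist with a *_SER level.
val data = sc.parallelize(1 to 1000000)
data.persist(StorageLevel.MEMORY_ONLY_SER)
data.count()                              // materializes the cached, compressed partitions
```

The space/CPU trade-off mentioned in the description applies to the cached blocks produced by the count above.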
diff --git a/docs/job-scheduling.md b/docs/job-scheduling.md
index 6c587b3f0d..95d47794ea 100644
--- a/docs/job-scheduling.md
+++ b/docs/job-scheduling.md
@@ -39,7 +39,10 @@ Resource allocation can be configured as follows, based on the cluster type:
and optionally set `spark.cores.max` to limit each application's resource share as in the standalone mode.
You should also set `spark.executor.memory` to control the executor memory.
* **YARN:** The `--num-executors` option to the Spark YARN client controls how many executors it will allocate
- on the cluster, while `--executor-memory` and `--executor-cores` control the resources per executor.
+ on the cluster (`spark.executor.instances` as configuration property), while `--executor-memory`
+ (`spark.executor.memory` configuration property) and `--executor-cores` (`spark.executor.cores` configuration
+ property) control the resources per executor. For more information, see the
+ [YARN Spark Properties](running-on-yarn.html).
A second option available on Mesos is _dynamic sharing_ of CPU cores. In this mode, each Spark application
still has a fixed and independent memory allocation (set by `spark.executor.memory`), but when the
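For the standalone and Mesos coarse-grained cases mentioned in this hunk, a minimal sketch (limits are hypothetical) of capping an application's share with spark.cores.max and spark.executor.memory:

```scala
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("StaticPartitioningExample")  // illustrative app name
  .set("spark.cores.max", "8")              // at most 8 cores across the cluster
  .set("spark.executor.memory", "4g")       // memory per executor
// The master URL (spark:// or mesos://) is supplied by spark-submit.
val sc = new SparkContext(conf)
```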
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index a148c867eb..ad66b9f64a 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -114,6 +114,19 @@ If you need a reference to the proper location to put log files in the YARN so t
</td>
</tr>
<tr>
+ <td><code>spark.driver.memory</code></td>
+ <td>1g</td>
+ <td>
+ Amount of memory to use for the driver process, i.e. where SparkContext is initialized.
+ (e.g. <code>1g</code>, <code>2g</code>).
+
+ <br /><em>Note:</em> In client mode, this config must not be set through the <code>SparkConf</code>
+ directly in your application, because the driver JVM has already started at that point.
+ Instead, please set this through the <code>--driver-memory</code> command line option
+ or in your default properties file.
+ </td>
+</tr>
+<tr>
<td><code>spark.driver.cores</code></td>
<td><code>1</code></td>
<td>
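Since the spark.driver.memory entry added above cannot take effect from SparkConf in client mode (the driver JVM is already running), a hedged sketch of the alternatives it points to; the memory size and file paths are illustrative:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Pass driver memory on the command line (hypothetical invocation):
//   spark-submit --deploy-mode client --driver-memory 4g app.jar
// or in conf/spark-defaults.conf:
//   spark.driver.memory  4g
val sc = new SparkContext(new SparkConf().setAppName("DriverMemoryCheck"))

// Inspect what the driver actually received; falls back to the 1g default.
println(sc.getConf.get("spark.driver.memory", "1g"))
```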
@@ -203,6 +216,13 @@ If you need a reference to the proper location to put log files in the YARN so t
</td>
</tr>
<tr>
+ <td><code>spark.executor.cores</code></td>
+ <td>1 in YARN mode, all the available cores on the worker in standalone mode.</td>
+ <td>
+ The number of cores to use on each executor. For YARN and standalone mode only.
+ </td>
+</tr>
+<tr>
<td><code>spark.executor.instances</code></td>
<td><code>2</code></td>
<td>
@@ -210,6 +230,13 @@ If you need a reference to the proper location to put log files in the YARN so t
</td>
</tr>
<tr>
+ <td><code>spark.executor.memory</code></td>
+ <td>1g</td>
+ <td>
+ Amount of memory to use per executor process (e.g. <code>2g</code>, <code>8g</code>).
+ </td>
+</tr>
+<tr>
<td><code>spark.yarn.executor.memoryOverhead</code></td>
<td>executorMemory * 0.10, with minimum of 384 </td>
<td>