author: Sandy Ryza <sandy@cloudera.com> 2014-11-10 12:40:41 -0800
committer: Patrick Wendell <pwendell@gmail.com> 2014-11-10 12:40:41 -0800
commit: c6f4e704214097f17d2d6abfbfef4bb208e4339f
tree: d41e3fc17f65d64515fd962152245b95ca36def9 /docs/configuration.md
parent: c5db8e2c07e442654f3d368608108e714e080184
SPARK-4230. Doc for spark.default.parallelism is incorrect
Author: Sandy Ryza <sandy@cloudera.com>

Closes #3107 from sryza/sandy-spark-4230 and squashes the following commits:

37a1d19 [Sandy Ryza] Clear up a couple things
34d53de [Sandy Ryza] SPARK-4230. Doc for spark.default.parallelism is incorrect
Diffstat (limited to 'docs/configuration.md')
-rw-r--r-- docs/configuration.md | 7
1 file changed, 5 insertions, 2 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index 0f9eb81f6e..f0b396e21f 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -562,6 +562,9 @@ Apart from these, the following properties are also available, and may be useful
<tr>
<td><code>spark.default.parallelism</code></td>
<td>
+ For distributed shuffle operations like <code>reduceByKey</code> and <code>join</code>, the
+ largest number of partitions in a parent RDD. For operations like <code>parallelize</code>
+ with no parent RDDs, it depends on the cluster manager:
<ul>
<li>Local mode: number of cores on the local machine</li>
<li>Mesos fine grained mode: 8</li>
@@ -569,8 +572,8 @@ Apart from these, the following properties are also available, and may be useful
</ul>
</td>
<td>
- Default number of tasks to use across the cluster for distributed shuffle operations
- (<code>groupByKey</code>, <code>reduceByKey</code>, etc) when not set by user.
+ Default number of partitions in RDDs returned by transformations like <code>join</code>,
+ <code>reduceByKey</code>, and <code>parallelize</code> when not set by user.
</td>
</tr>
<tr>
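
The rule described by the added lines can be sketched as a small model. This is not Spark's code or API: the function name `default_num_partitions` and its parameters are hypothetical, and the sketch assumes that an explicitly configured `spark.default.parallelism` takes precedence over the documented defaults. It also ignores any other factors Spark may consider when choosing a partitioner.

```python
from typing import Optional, Sequence


def default_num_partitions(
    parent_partitions: Sequence[int],
    cluster_default: int,
    configured_parallelism: Optional[int] = None,
) -> int:
    """Hypothetical model of the partition count described in the patched doc.

    parent_partitions: partition counts of the parent RDDs (empty for
        operations like parallelize, which have no parent RDDs).
    cluster_default: the cluster-manager-dependent fallback, e.g. the number
        of local cores in local mode or 8 in Mesos fine grained mode.
    configured_parallelism: an explicit spark.default.parallelism value, if
        any (assumed here to override the defaults).
    """
    if configured_parallelism is not None:
        # An explicit setting wins over the documented defaults.
        return configured_parallelism
    if parent_partitions:
        # Shuffle operations such as reduceByKey and join: use the largest
        # number of partitions among the parent RDDs.
        return max(parent_partitions)
    # No parent RDDs: fall back to the cluster manager's default.
    return cluster_default
```

For example, under this model a `join` of parents with 4 and 16 partitions yields 16 partitions when nothing is configured, while an operation with no parents in Mesos fine grained mode yields 8.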