commit 2117eea71ece825fbc3797c8b38184ae221f5223
tree   06481ef1968367118e89779335e24245f57f2017
parent e789000b88a6bd840f821c53f42c08b97dc02496
Author:    Josh Rosen <joshrosen@databricks.com>  2015-09-19 21:40:21 -0700
Committer: Reynold Xin <rxin@databricks.com>      2015-09-19 21:40:21 -0700
[SPARK-10710] Remove ability to disable spilling in core and SQL
It does not make much sense to set `spark.shuffle.spill` or `spark.sql.planner.externalSort` to false: I believe that these configurations were initially added as "escape hatches" to guard against bugs in the external operators, but these operators are now mature and well-tested. In addition, these configurations are not handled in a consistent way anymore: SQL's Tungsten codepath ignores these configurations and will continue to use spilling operators. Similarly, Spark Core's `tungsten-sort` shuffle manager does not respect `spark.shuffle.spill=false`.

This pull request removes these configurations, adds warnings at the appropriate places, and deletes a large amount of code which was only used in code paths that did not support spilling.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #8831 from JoshRosen/remove-ability-to-disable-spilling.
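For illustration, a minimal sketch (not part of the patch) of what setting the removed key looks like after this change. Per the description above, the configuration is no longer honored and Spark warns instead; the app name and master below are placeholders:

{% highlight scala %}
import org.apache.spark.{SparkConf, SparkContext}

// The removed key can still be set, but it no longer disables spilling;
// per this patch, Spark logs a warning at the appropriate place instead.
val conf = new SparkConf()
  .setAppName("spill-config-demo") // placeholder name
  .setMaster("local[2]")
  .set("spark.shuffle.spill", "false") // ignored after SPARK-10710

val sc = new SparkContext(conf) // shuffles will still spill to disk as needed
{% endhighlight %}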
Diffstat (limited to 'docs/configuration.md')
-rw-r--r--  docs/configuration.md | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/docs/configuration.md b/docs/configuration.md
index 3700051efb..5ec097c78a 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -69,7 +69,7 @@ val sc = new SparkContext(new SparkConf())
Then, you can supply configuration values at runtime:
{% highlight bash %}
-./bin/spark-submit --name "My app" --master local[4] --conf spark.shuffle.spill=false
+./bin/spark-submit --name "My app" --master local[4] --conf spark.eventLog.enabled=false
--conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" myApp.jar
{% endhighlight %}
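For comparison, a minimal sketch of supplying the same two values programmatically through SparkConf rather than --conf flags; the app name and master mirror the example above:

{% highlight scala %}
import org.apache.spark.{SparkConf, SparkContext}

// Programmatic equivalent of the --conf flags in the spark-submit example;
// values set here are read once when the SparkContext is constructed.
val conf = new SparkConf()
  .setAppName("My app")
  .setMaster("local[4]")
  .set("spark.eventLog.enabled", "false")
  .set("spark.executor.extraJavaOptions",
    "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps")

val sc = new SparkContext(conf)
{% endhighlight %}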
@@ -449,8 +449,8 @@ Apart from these, the following properties are also available, and may be useful
<td><code>spark.shuffle.memoryFraction</code></td>
<td>0.2</td>
<td>
- Fraction of Java heap to use for aggregation and cogroups during shuffles, if
- <code>spark.shuffle.spill</code> is true. At any given time, the collective size of
+ Fraction of Java heap to use for aggregation and cogroups during shuffles.
+ At any given time, the collective size of
all in-memory maps used for shuffles is bounded by this limit, beyond which the contents will
begin to spill to disk. If spills happen often, consider increasing this value at the expense of
<code>spark.storage.memoryFraction</code>.
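A rough worked example of this bound, assuming a 4 GB executor heap: with the 0.2 default, in-memory shuffle maps may collectively grow to about 4 GB × 0.2 = 0.8 GB before contents start spilling to disk (somewhat less in practice, since Spark of this era also applies an internal safety fraction, 0.8 by default, giving roughly 0.64 GB).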
@@ -484,14 +484,6 @@ Apart from these, the following properties are also available, and may be useful
</td>
</tr>
<tr>
- <td><code>spark.shuffle.spill</code></td>
- <td>true</td>
- <td>
- If set to "true", limits the amount of memory used during reduces by spilling data out to disk.
- This spilling threshold is specified by <code>spark.shuffle.memoryFraction</code>.
- </td>
-</tr>
-<tr>
<td><code>spark.shuffle.spill.compress</code></td>
<td>true</td>
<td>