path: root/docs/running-on-mesos.md
author    Michael Gummelt <mgummelt@mesosphere.io>    2016-07-06 15:02:45 -0700
committer Reynold Xin <rxin@databricks.com>           2016-07-06 15:02:45 -0700
commit    9c041990cf4d0138d9104207b5c2e7a319b42615 (patch)
tree      525b5540536dffa6c2cc9d616dc111a7e6e4c972 /docs/running-on-mesos.md
parent    a8f89df3b391e7a3fa9f73d9ec730d6eaa95bb09 (diff)
[MESOS] expand coarse-grained mode docs
## What changes were proposed in this pull request?

docs

## How was this patch tested?

viewed the docs in github

Author: Michael Gummelt <mgummelt@mesosphere.io>

Closes #14059 from mgummelt/coarse-grained.
Diffstat (limited to 'docs/running-on-mesos.md')
-rw-r--r--  docs/running-on-mesos.md  77
1 file changed, 51 insertions(+), 26 deletions(-)
diff --git a/docs/running-on-mesos.md b/docs/running-on-mesos.md
index 4a0ab623c1..8ab5f30220 100644
--- a/docs/running-on-mesos.md
+++ b/docs/running-on-mesos.md
@@ -180,30 +180,53 @@ Note that jars or python files that are passed to spark-submit should be URIs re
# Mesos Run Modes
-Spark can run over Mesos in two modes: "coarse-grained" (default) and "fine-grained".
-
-The "coarse-grained" mode will launch only *one* long-running Spark task on each Mesos
-machine, and dynamically schedule its own "mini-tasks" within it. The benefit is much lower startup
-overhead, but at the cost of reserving the Mesos resources for the complete duration of the
-application.
-
-Coarse-grained is the default mode. You can also set `spark.mesos.coarse` property to true
-to turn it on explicitly in [SparkConf](configuration.html#spark-properties):
-
-{% highlight scala %}
-conf.set("spark.mesos.coarse", "true")
-{% endhighlight %}
-
-In addition, for coarse-grained mode, you can control the maximum number of resources Spark will
-acquire. By default, it will acquire *all* cores in the cluster (that get offered by Mesos), which
-only makes sense if you run just one application at a time. You can cap the maximum number of cores
-using `conf.set("spark.cores.max", "10")` (for example).
-
-In "fine-grained" mode, each Spark task runs as a separate Mesos task. This allows
-multiple instances of Spark (and other frameworks) to share machines at a very fine granularity,
-where each application gets more or fewer machines as it ramps up and down, but it comes with an
-additional overhead in launching each task. This mode may be inappropriate for low-latency
-requirements like interactive queries or serving web requests.
+Spark can run over Mesos in two modes: "coarse-grained" (default) and
+"fine-grained".
+
+## Coarse-Grained
+
+In "coarse-grained" mode, each Spark executor runs as a single Mesos
+task. Spark executors are sized according to the following
+configuration variables:
+
+* Executor memory: `spark.executor.memory`
+* Executor cores: `spark.executor.cores`
+* Number of executors: `spark.cores.max`/`spark.executor.cores`
+
+Please see the [Spark Configuration](configuration.html) page for
+details and default values.
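+
+For example, with the illustrative values below, each executor gets
+4 cores and 4g of memory, and Spark launches
+`spark.cores.max` / `spark.executor.cores` = 24 / 4 = 6 executors:
+
+{% highlight scala %}
+// Illustrative sizing values -- tune these for your cluster.
+conf.set("spark.executor.memory", "4g") // memory per executor
+conf.set("spark.executor.cores", "4")   // cores per executor
+conf.set("spark.cores.max", "24")       // total cores across all executors
+{% endhighlight %}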
+
+Executors are brought up eagerly when the application starts, until
+`spark.cores.max` is reached. If you don't set `spark.cores.max`, the
+Spark application will reserve all resources offered to it by Mesos,
+so be sure to set this variable in any sort of multi-tenant cluster,
+including one which runs multiple concurrent Spark applications.
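+
+For example, to cap the application at 10 cores:
+
+{% highlight scala %}
+conf.set("spark.cores.max", "10")
+{% endhighlight %}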
+
+The scheduler will start executors round-robin on the offers Mesos
+gives it, but there are no spread guarantees, as Mesos does not
+provide such guarantees on the offer stream.
+
+The benefit of coarse-grained mode is much lower startup overhead,
+but it comes at the cost of reserving Mesos resources for the
+complete duration of the application. To configure your job to
+dynamically adjust to its resource requirements, look into
+[Dynamic Allocation](#dynamic-resource-allocation-with-mesos).
+
+## Fine-Grained
+
+In "fine-grained" mode, each Spark task inside the Spark executor runs
+as a separate Mesos task. This allows multiple instances of Spark (and
+other frameworks) to share cores at a very fine granularity, where
+each application gets more or fewer cores as it ramps up and down, but
+it comes with an additional overhead in launching each task. This mode
+may be inappropriate for low-latency requirements like interactive
+queries or serving web requests.
+
+Note that while Spark tasks in fine-grained mode will relinquish cores
+as they terminate, they will not relinquish memory, as the JVM does not
+give memory back to the operating system. Neither will executors
+terminate when they are idle.
To run in fine-grained mode, set the `spark.mesos.coarse` property to false in your
[SparkConf](configuration.html#spark-properties):
@@ -212,7 +235,9 @@ To run in fine-grained mode, set the `spark.mesos.coarse` property to false in y
conf.set("spark.mesos.coarse", "false")
{% endhighlight %}
-You may also make use of `spark.mesos.constraints` to set attribute based constraints on mesos resource offers. By default, all resource offers will be accepted.
+You may also make use of `spark.mesos.constraints` to set
+attribute-based constraints on Mesos resource offers. By default, all
+resource offers will be accepted.
{% highlight scala %}
conf.set("spark.mesos.constraints", "os:centos7;us-east-1:false")
@@ -246,7 +271,7 @@ In either case, HDFS runs separately from Hadoop MapReduce, without being schedu
# Dynamic Resource Allocation with Mesos
-Mesos supports dynamic allocation only with coarse-grain mode, which can resize the number of
+Mesos supports dynamic allocation only with coarse-grained mode, which can resize the number of
executors based on statistics of the application. For general information,
see [Dynamic Resource Allocation](job-scheduling.html#dynamic-resource-allocation).
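
As a minimal sketch of the configuration side (the linked page
describes the full setup, including the external shuffle service that
dynamic allocation relies on):

{% highlight scala %}
// Let Spark add and remove executors based on the workload.
conf.set("spark.dynamicAllocation.enabled", "true")
// The external shuffle service preserves shuffle files written by
// executors that are later removed.
conf.set("spark.shuffle.service.enabled", "true")
{% endhighlight %}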