From d1d41ccee49a5c093cb61c791c01f64f2076b83e Mon Sep 17 00:00:00 2001
From: Andrew Ash
Date: Wed, 14 May 2014 09:45:33 -0700
Subject: SPARK-1818 Freshen Mesos documentation

Place more emphasis on using precompiled binary versions of Spark and Mesos
instead of encouraging the reader to compile from source.

Author: Andrew Ash

Closes #756 from ash211/spark-1818 and squashes the following commits:

7ef3b33 [Andrew Ash] Brief explanation of the interactions between Spark and Mesos
e7dea8e [Andrew Ash] Add troubleshooting and debugging section
956362d [Andrew Ash] Don't need to pass spark.executor.uri into the spark shell
de3353b [Andrew Ash] Wrap to 100char
7ebf6ef [Andrew Ash] Polish on the section on Mesos Master URLs
3dcc2c1 [Andrew Ash] Use --tgz parameter of make-distribution
41b68ed [Andrew Ash] Period at end of sentence; formatting on :5050
8bf2c53 [Andrew Ash] Update site.MESOS_VERSIOn to match /pom.xml
74f2040 [Andrew Ash] SPARK-1818 Freshen Mesos documentation
---
 docs/running-on-mesos.md | 200 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 173 insertions(+), 27 deletions(-)

diff --git a/docs/running-on-mesos.md b/docs/running-on-mesos.md
index 68259f0cb8..ef762aa7b8 100644
--- a/docs/running-on-mesos.md
+++ b/docs/running-on-mesos.md
@@ -3,19 +3,123 @@ layout: global
 title: Running Spark on Mesos
 ---
-Spark can run on clusters managed by [Apache Mesos](http://mesos.apache.org/). Follow the steps below to install Mesos and Spark:
-
-1. Download and build Spark using the instructions [here](index.html). **Note:** Don't forget to consider what version of HDFS you might want to use!
-2. Download, build, install, and start Mesos {{site.MESOS_VERSION}} on your cluster. You can download the Mesos distribution from a [mirror](http://www.apache.org/dyn/closer.cgi/mesos/{{site.MESOS_VERSION}}/). See the Mesos [Getting Started](http://mesos.apache.org/gettingstarted) page for more information. **Note:** If you want to run Mesos without installing it into the default paths on your system (e.g., if you don't have administrative privileges to install it), you should also pass the `--prefix` option to `configure` to tell it where to install. For example, pass `--prefix=/home/user/mesos`. By default the prefix is `/usr/local`.
-3. Create a Spark "distribution" using `make-distribution.sh`.
-4. Rename the `dist` directory created from `make-distribution.sh` to `spark-{{site.SPARK_VERSION}}`.
-5. Create a `tar` archive: `tar czf spark-{{site.SPARK_VERSION}}.tar.gz spark-{{site.SPARK_VERSION}}`
-6. Upload this archive to HDFS or another place accessible from Mesos via `http://`, e.g., [Amazon Simple Storage Service](http://aws.amazon.com/s3): `hadoop fs -put spark-{{site.SPARK_VERSION}}.tar.gz /path/to/spark-{{site.SPARK_VERSION}}.tar.gz`
-7. Create a file called `spark-env.sh` in Spark's `conf` directory, by copying `conf/spark-env.sh.template`, and add the following lines to it:
-   * `export MESOS_NATIVE_LIBRARY=<path to libmesos.so>`. This path is usually `<prefix>/lib/libmesos.so` (where the prefix is `/usr/local` by default, see above). Also, on Mac OS X, the library is called `libmesos.dylib` instead of `libmesos.so`.
-   * `export SPARK_EXECUTOR_URI=<URL of spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>`.
-   * `export MASTER=mesos://HOST:PORT` where HOST:PORT is the host and port (default: 5050) of your Mesos master (or `zk://...` if using Mesos with ZooKeeper).
-8. To run a Spark application against the cluster, when you create your `SparkContext`, pass the string `mesos://HOST:PORT` as the master URL.
-In addition, you'll need to set the `spark.executor.uri` property. For example:
+# Why Mesos
+
+Spark can run on hardware clusters managed by [Apache Mesos](http://mesos.apache.org/).
+
+The advantages of deploying Spark with Mesos include:
+- dynamic partitioning between Spark and other
+  [frameworks](https://mesos.apache.org/documentation/latest/mesos-frameworks/)
+- scalable partitioning between multiple instances of Spark
+
+# How it works
+
+In a standalone cluster deployment, the cluster manager in the diagram below is a Spark master
+instance. When using Mesos, the Mesos master replaces the Spark master as the cluster manager.
+

+<p style="text-align: center;">
+  <img src="img/cluster-overview.png" title="Spark cluster components" alt="Spark cluster components" />
+</p>
+
+Now when a driver creates a job and starts issuing tasks for scheduling, Mesos determines what
+machines handle what tasks. Because it takes into account other frameworks when scheduling these
+many short-lived tasks, multiple frameworks can coexist on the same cluster without resorting to a
+static partitioning of resources.
+
+To get started, follow the steps below to install Mesos and deploy Spark jobs via Mesos.
+
+
+# Installing Mesos
+
+Spark {{site.SPARK_VERSION}} is designed for use with Mesos {{site.MESOS_VERSION}} and does not
+require any special patches of Mesos.
+
+If you already have a Mesos cluster running, you can skip this Mesos installation step.
+
+Otherwise, installing Mesos for Spark is no different from installing Mesos for use by other
+frameworks. You can install Mesos using either prebuilt packages or by compiling from source.
+
+## Prebuilt packages
+
+The Apache Mesos project publishes only source releases, not binary releases. But other
+third-party projects publish binary releases that may be helpful in setting Mesos up.
+
+One of those is Mesosphere. To install Mesos using the binary releases provided by Mesosphere:
+
+1. Download the Mesos installation package from the Mesosphere [downloads page](http://mesosphere.io/downloads/)
+2. Follow their instructions for installation and configuration
+
+The Mesosphere installation documents suggest setting up ZooKeeper to handle Mesos master failover,
+but Mesos can be run without ZooKeeper using a single master as well.
+
+## From source
+
+To install Mesos directly from the upstream project rather than from a third party, build it from
+source:
+
+1. Download the Mesos distribution from a
+   [mirror](http://www.apache.org/dyn/closer.cgi/mesos/{{site.MESOS_VERSION}}/)
+2. Follow the Mesos [Getting Started](http://mesos.apache.org/gettingstarted) page for compiling and
+   installing Mesos
+
+**Note:** If you want to run Mesos without installing it into the default paths on your system
+(e.g., if you lack administrative privileges to install it), you should also pass the
+`--prefix` option to `configure` to tell it where to install. For example, pass
+`--prefix=/home/user/mesos`. By default the prefix is `/usr/local`.
+
+## Verification
+
+To verify that the Mesos cluster is ready for Spark, navigate to the Mesos master web UI at port
+`:5050`. Confirm that all expected machines are present in the slaves tab.
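
You can also check from the command line by querying the master's HTTP endpoint directly. The
sketch below is illustrative: it assumes a master reachable at a host named `mesos-master` on the
default port, and uses the `state.json` endpoint that backs the Mesos web UI.

{% highlight bash %}
# Ask the Mesos master for its current state (the same data shown in the web UI).
# "mesos-master" is a placeholder hostname; substitute your master's address.
curl http://mesos-master:5050/master/state.json
# Registered machines appear in the "slaves" array of the JSON response.
{% endhighlight %}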
+
+
+# Connecting Spark to Mesos
+
+To use Mesos from Spark, you need a Spark distribution available in a place accessible by Mesos, and
+a Spark driver program configured to connect to Mesos.
+
+## Uploading Spark Distribution
+
+When Mesos runs a task on a Mesos slave for the first time, that slave must have a distribution of
+Spark available for running the Spark Mesos executor backend. A distribution of Spark is just a
+compiled binary version of Spark.
+
+The Spark distribution can be hosted at any Hadoop URI, including HTTP via `http://`, [Amazon Simple
+Storage Service](http://aws.amazon.com/s3) via `s3://`, or HDFS via `hdfs:///`.
+
+To use a precompiled distribution:
+
+1. Download a Spark distribution from the Spark [download page](https://spark.apache.org/downloads.html)
+2. Upload it to HDFS, HTTP, or S3
+
+To host on HDFS, use the Hadoop `fs -put` command: `hadoop fs -put spark-{{site.SPARK_VERSION}}.tar.gz
+/path/to/spark-{{site.SPARK_VERSION}}.tar.gz`
+
+
+Or, if you are using a custom-compiled version of Spark, you will need to create a distribution
+using the `make-distribution.sh` script included in a Spark source tarball/checkout:
+
+1. Download and build Spark using the instructions [here](index.html)
+2. Create a Spark distribution using `make-distribution.sh --tgz`
+3. Upload the archive to HTTP, S3, or HDFS, as in the sketch below
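
For concreteness, here is a rough end-to-end sketch of the custom-compiled path. The HDFS directory
`/spark` and the archive glob are illustrative choices, not requirements; the exact archive name
depends on your Spark version and the `make-distribution.sh` script.

{% highlight bash %}
# From a Spark source checkout: build a binary distribution.
# --tgz packages the dist/ directory into a spark-*.tgz archive.
./make-distribution.sh --tgz

# Upload the archive to HDFS so Mesos slaves can download it.
# /spark is an arbitrary example location.
hadoop fs -mkdir /spark
hadoop fs -put spark-*.tgz /spark/
{% endhighlight %}

The resulting URI (here, something like `hdfs:///spark/spark-{{site.SPARK_VERSION}}.tgz`) is what
`spark.executor.uri` and `SPARK_EXECUTOR_URI` should point at in the next section.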
+
+
+## Using a Mesos Master URL
+
+The Master URLs for Mesos are in the form `mesos://host:5050` for a single-master Mesos
+cluster, or `zk://host:2181` for a multi-master Mesos cluster using ZooKeeper.
+
+The driver also needs some configuration in `spark-env.sh` to interact properly with Mesos
+(see the sketch after this list):
+
+1. In `spark-env.sh` set some environment variables:
+   * `export MESOS_NATIVE_LIBRARY=<path to libmesos.so>`. This path is typically
+     `<prefix>/lib/libmesos.so` where the prefix is `/usr/local` by default. See Mesos installation
+     instructions above. On Mac OS X, the library is called `libmesos.dylib` instead of
+     `libmesos.so`.
+   * `export SPARK_EXECUTOR_URI=<URL of spark-{{site.SPARK_VERSION}}.tar.gz>`.
+2. Also set `spark.executor.uri` to `<URL of spark-{{site.SPARK_VERSION}}.tar.gz>`.
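
Putting that together, a `conf/spark-env.sh` might look like the following sketch. The library path
and the HDFS URL are example values carried over from the steps above, not defaults.

{% highlight bash %}
# conf/spark-env.sh -- example values; substitute your own paths.

# Native Mesos library (libmesos.dylib instead on Mac OS X).
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so

# URI from which Mesos slaves can download the Spark distribution.
export SPARK_EXECUTOR_URI=hdfs:///spark/spark-{{site.SPARK_VERSION}}.tgz
{% endhighlight %}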
+
+Now when starting a Spark application against the cluster, pass a `mesos://`
+or `zk://` URL as the master when creating a `SparkContext`. For example:
 
 {% highlight scala %}
 val conf = new SparkConf()
@@ -25,31 +129,73 @@ val conf = new SparkConf()
 val sc = new SparkContext(conf)
 {% endhighlight %}
 
+When running a shell, the `spark.executor.uri` parameter is inherited from `SPARK_EXECUTOR_URI`, so
+it does not need to be redundantly passed in as a system property.
+
+{% highlight bash %}
+./bin/spark-shell --master mesos://host:5050
+{% endhighlight %}
+
+
 # Mesos Run Modes
 
-Spark can run over Mesos in two modes: "fine-grained" and "coarse-grained". In fine-grained mode, which is the default,
-each Spark task runs as a separate Mesos task. This allows multiple instances of Spark (and other frameworks) to share
-machines at a very fine granularity, where each application gets more or fewer machines as it ramps up, but it comes with an
-additional overhead in launching each task, which may be inappropriate for low-latency applications (e.g. interactive queries or serving web requests). The coarse-grained mode will instead
-launch only *one* long-running Spark task on each Mesos machine, and dynamically schedule its own "mini-tasks" within
-it. The benefit is much lower startup overhead, but at the cost of reserving the Mesos resources for the complete duration
-of the application.
+Spark can run over Mesos in two modes: "fine-grained" (the default) and "coarse-grained".
+
+In "fine-grained" mode, each Spark task runs as a separate Mesos task. This allows
+multiple instances of Spark (and other frameworks) to share machines at a very fine granularity,
+where each application gets more or fewer machines as it ramps up and down, but it comes with an
+additional overhead in launching each task. This mode may be inappropriate for low-latency
+requirements like interactive queries or serving web requests.
+
+The "coarse-grained" mode will instead launch only *one* long-running Spark task on each Mesos
+machine, and dynamically schedule its own "mini-tasks" within it. The benefit is much lower startup
+overhead, but at the cost of reserving the Mesos resources for the complete duration of the
+application.
 
-To run in coarse-grained mode, set the `spark.mesos.coarse` property in your [SparkConf](configuration.html#spark-properties):
+To run in coarse-grained mode, set the `spark.mesos.coarse` property in your
+[SparkConf](configuration.html#spark-properties):
 
 {% highlight scala %}
 conf.set("spark.mesos.coarse", "true")
 {% endhighlight %}
 
-In addition, for coarse-grained mode, you can control the maximum number of resources Spark will acquire. By default,
-it will acquire *all* cores in the cluster (that get offered by Mesos), which only makes sense if you run just one
-application at a time. You can cap the maximum number of cores using `conf.set("spark.cores.max", "10")` (for example).
+In addition, for coarse-grained mode, you can control the maximum amount of resources Spark will
+acquire. By default, it will acquire *all* cores in the cluster (that get offered by Mesos), which
+only makes sense if you run just one application at a time. You can cap the maximum number of cores
+using `conf.set("spark.cores.max", "10")` (for example).
 
 # Running Alongside Hadoop
 
-You can run Spark and Mesos alongside your existing Hadoop cluster by just launching them as a separate service on the machines. To access Hadoop data from Spark, just use a hdfs:// URL (typically `hdfs://<namenode>:9000/path`, but you can find the right URL on your Hadoop Namenode's web UI).
+You can run Spark and Mesos alongside your existing Hadoop cluster by just launching them as a
+separate service on the machines. To access Hadoop data from Spark, a full hdfs:// URL is required
+(typically `hdfs://<namenode>:9000/path`, but you can find the right URL on your Hadoop Namenode web
+UI).
+
+In addition, it is also possible to run Hadoop MapReduce on Mesos for better resource isolation and
+sharing between the two. In this case, Mesos will act as a unified scheduler that assigns cores to
+either Hadoop or Spark, as opposed to having them share resources via the Linux scheduler on each
+node. Please refer to [Hadoop on Mesos](https://github.com/mesos/hadoop).
+
+In either case, HDFS runs separately from Hadoop MapReduce, without being scheduled through Mesos.
+
+
+# Troubleshooting and Debugging
+
+A few places to look during debugging:
+
+- Mesos master on port `:5050`
+  - Slaves should appear in the slaves tab
+  - Spark applications should appear in the frameworks tab
+  - Tasks should appear in the details of a framework
+  - Check the stdout and stderr of the sandbox of failed tasks
+- Mesos logs
+  - Master and slave logs are both in `/var/log/mesos` by default
 
-In addition, it is possible to also run Hadoop MapReduce on Mesos, to get better resource isolation and sharing between the two. In this case, Mesos will act as a unified scheduler that assigns cores to either Hadoop or Spark, as opposed to having them share resources via the Linux scheduler on each node. Please refer to [Hadoop on Mesos](https://github.com/mesos/hadoop).
+And common pitfalls:
 
-In either case, HDFS runs separately from Hadoop MapReduce, without going through Mesos.
+- Spark assembly not reachable/accessible
+  - Slaves need to be able to download the distribution
+- Firewall blocking communications
+  - Check for messages about failed connections
+  - Temporarily disable firewalls for debugging and then poke appropriate holes
-- 
cgit v1.2.3