path: root/docs/running-on-mesos.md
authorMartin Weindel <martin.weindel@gmail.com>2014-08-26 18:30:39 -0700
committerMatei Zaharia <matei@databricks.com>2014-08-26 18:30:45 -0700
commitbe043e3f20c6562482f9e4e739d8bb3fc9c1f201 (patch)
tree84d23fdbe1685d009963df7e774facd42ff2462a /docs/running-on-mesos.md
parent727cb25bcc29481d6b744abef1ca091e64b5f91f (diff)
[SPARK-3240] Adding known issue for MESOS-1688
When using Mesos in fine-grained mode, a Spark job can run into a deadlock when allocatable memory on the Mesos slaves is low. As a workaround, 32 MB (= Mesos MIN_MEM) is allocated for each task, to ensure that Mesos makes new offers after task completion. From my perspective, it would be better to fix this problem in Mesos by dropping the memory constraint on offers, but as a temporary solution this patch helps to avoid the deadlock on current Mesos versions. See [[MESOS-1688] No offers if no memory is allocatable](https://issues.apache.org/jira/browse/MESOS-1688) for details on this problem.

Author: Martin Weindel <martin.weindel@gmail.com>

Closes #1860 from MartinWeindel/master and squashes the following commits:

5762030 [Martin Weindel] reverting work-around
a6bf837 [Martin Weindel] added known issue for issue MESOS-1688
d9d2ca6 [Martin Weindel] work around for problem with Mesos offering semantic (see [https://issues.apache.org/jira/browse/MESOS-1688])
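The commit message notes that coarse-grained mode is not affected by the deadlock. A minimal sketch of switching a job to that mode, assuming a Mesos master URL and memory value chosen for illustration (`spark.mesos.coarse` and `spark.executor.memory` are standard Spark configuration keys):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: run on Mesos in coarse-grained mode to sidestep the
// MESOS-1688 fine-grained deadlock. The master URL and memory
// value below are illustrative, not taken from the patch.
val conf = new SparkConf()
  .setMaster("mesos://host:5050")      // hypothetical Mesos master
  .setAppName("example")
  .set("spark.mesos.coarse", "true")   // one long-lived executor per slave
  .set("spark.executor.memory", "2g")  // leave memory headroom on the slaves
val sc = new SparkContext(conf)
```

In coarse-grained mode Spark holds its executors (and their resources) for the job's lifetime, so task completion no longer depends on Mesos re-offering freed memory.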
Diffstat (limited to 'docs/running-on-mesos.md')
-rw-r--r-- docs/running-on-mesos.md | 2 ++
1 file changed, 2 insertions, 0 deletions
diff --git a/docs/running-on-mesos.md b/docs/running-on-mesos.md
index 9998dddc65..1073abb202 100644
--- a/docs/running-on-mesos.md
+++ b/docs/running-on-mesos.md
@@ -165,6 +165,8 @@ acquire. By default, it will acquire *all* cores in the cluster (that get offere
only makes sense if you run just one application at a time. You can cap the maximum number of cores
using `conf.set("spark.cores.max", "10")` (for example).
+# Known issues
+- When using the "fine-grained" mode, make sure that your executors always leave 32 MB free on the slaves. Otherwise your Spark job may stop making progress. Currently, Apache Mesos only makes offers if at least 32 MB of memory is allocatable. But because Spark allocates memory only for the executor and CPU only for tasks, it can happen under high slave memory usage that no new tasks are started anymore. More details can be found in [MESOS-1688](https://issues.apache.org/jira/browse/MESOS-1688). Alternatively, use the "coarse-grained" mode, which is not affected by this issue.
# Running Alongside Hadoop