author     Tathagata Das <tathagata.das1565@gmail.com>  2013-02-20 09:01:29 -0800
committer  Tathagata Das <tathagata.das1565@gmail.com>  2013-02-20 09:01:29 -0800
commit     fb9956256d19b9f8f79de43099d2b5fc851bcf08 (patch)
tree       8a4bd96ce7f122342dcdc11626dae46b90e0c24c /docs
parent     7e30c46aaf337eb95c9ec37ddc2ad79439430c96 (diff)
parent     03d847999e8c54684128573b94973544026081b2 (diff)
Merge branch 'mesos-master' into streaming
Conflicts:
	core/src/main/scala/spark/rdd/CheckpointRDD.scala
	streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala
Diffstat (limited to 'docs')
 docs/configuration.md           | 10 +++++++++-
 docs/contributing-to-spark.md   |  2 +-
 docs/scala-programming-guide.md |  2 +-
 docs/spark-standalone.md        |  8 ++++++++
 docs/tuning.md                  |  2 +-
 5 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/docs/configuration.md b/docs/configuration.md
index a7054b4321..f1ca77aa78 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -198,6 +198,14 @@ Apart from these, the following properties are also available, and may be useful
</td>
</tr>
<tr>
+ <td>spark.worker.timeout</td>
+ <td>60</td>
+ <td>
+ Number of seconds after which the standalone deploy master considers a worker lost if it
+ receives no heartbeats.
+ </td>
+</tr>
+<tr>
<td>spark.akka.frameSize</td>
<td>10</td>
<td>
@@ -218,7 +226,7 @@ Apart from these, the following properties are also available, and may be useful
<td>spark.akka.timeout</td>
<td>20</td>
<td>
- Communication timeout between Spark nodes.
+ Communication timeout between Spark nodes, in seconds.
</td>
</tr>
<tr>
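For context, a minimal sketch of how properties like these could be set in a driver program of this vintage, assuming configuration via Java system properties before the SparkContext is created; the property names come from the table above, but the values are purely illustrative:

```scala
import spark.SparkContext

object TimeoutConfig {
  def main(args: Array[String]) {
    // Must be set before the SparkContext is created; illustrative values.
    System.setProperty("spark.worker.timeout", "120") // master waits 120s of silence before declaring a worker lost
    System.setProperty("spark.akka.timeout", "60")    // node-to-node communication timeout, in seconds

    val sc = new SparkContext("local", "Timeout Config")
    println(sc.parallelize(1 to 10).count())
    sc.stop()
  }
}
```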
diff --git a/docs/contributing-to-spark.md b/docs/contributing-to-spark.md
index c6e01c62d8..14d0dc856b 100644
--- a/docs/contributing-to-spark.md
+++ b/docs/contributing-to-spark.md
@@ -15,7 +15,7 @@ The Spark team welcomes contributions in the form of GitHub pull requests. Here
But first, make sure that you have [configured a spark-env.sh](configuration.html) with at least
`SCALA_HOME`, as some of the tests try to spawn subprocesses using this.
- Add new unit tests for your code. We use [ScalaTest](http://www.scalatest.org/) for testing. Just add a new Suite in `core/src/test`, or methods to an existing Suite.
-- If you'd like to report a bug but don't have time to fix it, you can still post it to our [issues page](https://github.com/mesos/spark/issues), or email the [mailing list](http://www.spark-project.org/mailing-lists.html).
+- If you'd like to report a bug but don't have time to fix it, you can still post it to our [issue tracker](https://spark-project.atlassian.net), or email the [mailing list](http://www.spark-project.org/mailing-lists.html).
# Licensing of Contributions
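As a rough sketch of the Suite convention the list above mentions, assuming ScalaTest's FunSuite style (the class name and assertion here are hypothetical, not from the repo):

```scala
import org.scalatest.FunSuite

// Hypothetical suite; it would live under core/src/test alongside the existing ones.
class ExampleSuite extends FunSuite {
  test("addition is computed correctly") {
    assert(1 + 1 === 2)
  }
}
```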
diff --git a/docs/scala-programming-guide.md b/docs/scala-programming-guide.md
index 301b330a79..b98718a553 100644
--- a/docs/scala-programming-guide.md
+++ b/docs/scala-programming-guide.md
@@ -203,7 +203,7 @@ A complete list of transformations is available in the [RDD API doc](api/core/in
<tr><th>Action</th><th>Meaning</th></tr>
<tr>
<td> <b>reduce</b>(<i>func</i>) </td>
- <td> Aggregate the elements of the dataset using a function <i>func</i> (which takes two arguments and returns one). The function should be associative so that it can be computed correctly in parallel. </td>
+ <td> Aggregate the elements of the dataset using a function <i>func</i> (which takes two arguments and returns one). The function should be commutative and associative so that it can be computed correctly in parallel. </td>
</tr>
<tr>
<td> <b>collect</b>() </td>
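A minimal sketch of why the stronger wording matters, assuming an existing SparkContext `sc`: addition satisfies both properties, while subtraction satisfies neither, so its result depends on how partitions are combined:

```scala
// Safe: + is commutative and associative, so partial results can be
// combined in any order across partitions.
val sum = sc.parallelize(1 to 100).reduce(_ + _)    // always 5050

// Unsafe: - is neither commutative nor associative, so the result
// can change with partitioning and scheduling.
val bad = sc.parallelize(1 to 100, 4).reduce(_ - _) // nondeterministic
```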
diff --git a/docs/spark-standalone.md b/docs/spark-standalone.md
index bf296221b8..3986c0c79d 100644
--- a/docs/spark-standalone.md
+++ b/docs/spark-standalone.md
@@ -115,6 +115,14 @@ You can optionally configure the cluster further by setting environment variable
<td><code>SPARK_WORKER_WEBUI_PORT</code></td>
<td>Port for the worker web UI (default: 8081)</td>
</tr>
+ <tr>
+ <td><code>SPARK_DAEMON_MEMORY</code></td>
+ <td>Memory to allocate to the Spark master and worker daemons themselves (default: 512m)</td>
+ </tr>
+ <tr>
+ <td><code>SPARK_DAEMON_JAVA_OPTS</code></td>
+ <td>JVM options for the Spark master and worker daemons themselves (default: none)</td>
+ </tr>
</table>
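For context, a sketch of what the corresponding `conf/spark-env.sh` entries might look like; the values are illustrative, not recommendations:

```sh
# conf/spark-env.sh (illustrative values)
export SPARK_DAEMON_MEMORY=1g                              # heap for the master/worker daemons themselves
export SPARK_DAEMON_JAVA_OPTS="-Dspark.worker.timeout=120" # extra JVM options for those daemons
```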
diff --git a/docs/tuning.md b/docs/tuning.md
index 9aaa53cd65..738c530458 100644
--- a/docs/tuning.md
+++ b/docs/tuning.md
@@ -233,7 +233,7 @@ number of cores in your clusters.
## Broadcasting Large Variables
-Using the [broadcast functionality](scala-programming-guide#broadcast-variables)
+Using the [broadcast functionality](scala-programming-guide.html#broadcast-variables)
available in `SparkContext` can greatly reduce the size of each serialized task, and the cost
of launching a job over a cluster. If your tasks use any large object from the driver program
inside of them (e.g. a static lookup table), consider turning it into a broadcast variable.
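A minimal sketch of the pattern this paragraph describes, assuming an existing SparkContext `sc` and a stand-in lookup table (the map contents are illustrative):

```scala
// Imagine this map is large; broadcasting ships it to each node once
// instead of serializing it into every task closure.
val lookup = Map(1 -> "a", 2 -> "b", 3 -> "c")
val bcLookup = sc.broadcast(lookup)

val decoded = sc.parallelize(Seq(1, 2, 3))
  .map(id => bcLookup.value.getOrElse(id, "?")) // read-only access on the workers
  .collect()
```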