[SPARK-6128][Streaming][Documentation] Updates to Spark Streaming Programming Guide

Updates to the documentation are as follows:
- Added information on Kafka Direct API and Kafka Python API
- Added joins to the main streaming guide
- Improved details on the fault-tolerance semantics

Generated docs located here
http://people.apache.org/~tdas/spark-1.3.0-temp-docs/streaming-programming-guide.html#fault-tolerance-semantics

More things to add:
- Configuration for Kafka receive rate
- May be add concurrentJobs

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #4956 from tdas/streaming-guide-update-1.3 and squashes the following commits:

819408c [Tathagata Das] Minor fixes.
debe484 [Tathagata Das] Added DataFrames and MLlib
380cf8d [Tathagata Das] Fix link
04167a6 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into streaming-guide-update-1.3
0b77486 [Tathagata Das] Updates based on Josh's comments.
86c4c2a [Tathagata Das] Updated streaming guides
82de92a [Tathagata Das] Add Kafka to Python api docs
author: Tathagata Das <tathagata.das1565@gmail.com> 2015-03-11 18:48:21 -0700
committer: Tathagata Das <tathagata.das1565@gmail.com> 2015-03-11 18:48:21 -0700
commit: cd3b68d93a01f11bd3d5a441b341cb33d227e900 (patch)
tree: a427f6dbdae218857ec6e8de066b76bf0f43f8ed /docs/configuration.md
parent: 51a79a770a8356bd0ed244af5ca7f1c44c9437d2 (diff)
download: spark-cd3b68d93a01f11bd3d5a441b341cb33d227e900.tar.gz
spark-cd3b68d93a01f11bd3d5a441b341cb33d227e900.tar.bz2
spark-cd3b68d93a01f11bd3d5a441b341cb33d227e900.zip
1 files changed, 12 insertions, 2 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index ae90fe1f8f..a7116fbece 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1345,9 +1345,9 @@ Apart from these, the following properties are also available, and may be useful
 </tr>
 <tr>
   <td><code>spark.streaming.receiver.maxRate</code></td>
-  <td>infinite</td>
+  <td>not set</td>
   <td>
-    Maximum number records per second at which each receiver will receive data.
+    Maximum rate (number of records per second) at which each receiver will receive data.
     Effectively, each stream will consume at most this number of records per second.
     Setting this configuration to 0 or a negative number will put no limit on the rate.
     See the <a href="streaming-programming-guide.html#deploying-applications">deployment guide</a>
@@ -1375,6 +1375,16 @@ Apart from these, the following properties are also available, and may be useful
     higher memory usage in Spark.
   </td>
 </tr>
+<tr>
+  <td><code>spark.streaming.kafka.maxRatePerPartition</code></td>
+  <td>not set</td>
+  <td>
+    Maximum rate (number of records per second) at which data will be read from each Kafka
+    partition when using the new Kafka direct stream API. See the
+    <a href="streaming-kafka-integration.html">Kafka Integration guide</a>
+    for more details.
+  </td>
+</tr>
 </table>
 
 #### Cluster Managers
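
As a rough illustration of what the two rate properties in this diff imply, the sketch below estimates the upper bound on records ingested per batch and formats the settings as `spark-submit --conf` flags. It is a minimal sketch, not part of the patch: the helper names `max_records_per_batch` and `conf_flags` are hypothetical, the batch interval and partition count are made-up inputs, and treating a non-positive value as "no limit" is stated in the docs only for `spark.streaming.receiver.maxRate` (assumed here for the Kafka property as well).

```python
from typing import Dict, List, Optional

# Property names taken from the diff above.
RECEIVER_MAX_RATE = "spark.streaming.receiver.maxRate"
KAFKA_MAX_RATE_PER_PARTITION = "spark.streaming.kafka.maxRatePerPartition"


def max_records_per_batch(rate_per_second: Optional[int],
                          parallel_units: int,
                          batch_seconds: float) -> Optional[int]:
    """Estimate the per-batch record cap implied by a per-second rate limit.

    parallel_units is the number of receivers (for receiver.maxRate) or the
    number of Kafka partitions (for kafka.maxRatePerPartition).
    Returns None when no limit applies (property not set, or non-positive;
    the latter is an assumption for the Kafka property).
    """
    if rate_per_second is None or rate_per_second <= 0:
        return None
    return int(rate_per_second * parallel_units * batch_seconds)


def conf_flags(settings: Dict[str, int]) -> List[str]:
    """Render settings as a list of `--conf key=value` arguments."""
    flags: List[str] = []
    for key, value in settings.items():
        flags += ["--conf", f"{key}={value}"]
    return flags


if __name__ == "__main__":
    # Hypothetical job: 4 Kafka partitions, 2-second batches, 1000 records/s each.
    print(max_records_per_batch(1000, parallel_units=4, batch_seconds=2))
    print(" ".join(conf_flags({KAFKA_MAX_RATE_PER_PARTITION: 1000})))
```

The resulting flags can be appended to a `spark-submit` command line; the same key and value could instead go in `spark-defaults.conf`.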