path: root/docs/streaming-programming-guide.md
author    Tathagata Das <tathagata.das1565@gmail.com>    2015-09-08 14:54:43 -0700
committer Tathagata Das <tathagata.das1565@gmail.com>    2015-09-08 14:54:43 -0700
commit  52b24a602ad615a7f6aa427aefb1c7444c05d298 (patch)
tree    8c64796dda119ac73ed79d05cf8b028b05e4158c /docs/streaming-programming-guide.md
parent  e6f8d3686016a305a747c5bcc85f46fd4c0cbe83 (diff)
[SPARK-10492] [STREAMING] [DOCUMENTATION] Update Streaming documentation about rate limiting and backpressure
Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #8656 from tdas/SPARK-10492 and squashes the following commits:

986cdd6 [Tathagata Das] Added information on backpressure
Diffstat (limited to 'docs/streaming-programming-guide.md')
-rw-r--r--  docs/streaming-programming-guide.md | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md
index a1acf83f75..c751dbb417 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -1807,7 +1807,7 @@ To run a Spark Streaming application, you need to have the following.
+ *Mesos* - [Marathon](https://github.com/mesosphere/marathon) has been used to achieve this
with Mesos.
-- *[Since Spark 1.2] Configuring write ahead logs* - Since Spark 1.2,
+- *Configuring write ahead logs* - Since Spark 1.2,
we have introduced _write ahead logs_ for achieving strong
fault-tolerance guarantees. If enabled, all the data received from a receiver gets written into
a write ahead log in the configured checkpoint directory. This prevents data loss on driver
@@ -1822,6 +1822,17 @@ To run a Spark Streaming application, you need to have the following.
stored in a replicated storage system. This can be done by setting the storage level for the
input stream to `StorageLevel.MEMORY_AND_DISK_SER`.
+- *Setting the max receiving rate* - If the cluster resources are not large enough for the streaming
+ application to process data as fast as it is being received, the receivers can be rate limited
+ by setting a maximum rate limit in terms of records/sec.
+ See the [configuration parameters](configuration.html#spark-streaming)
+ `spark.streaming.receiver.maxRate` for receivers and `spark.streaming.kafka.maxRatePerPartition`
+ for the Direct Kafka approach. In Spark 1.5, we have introduced a feature called *backpressure* that
+ eliminates the need to set this rate limit, as Spark Streaming automatically figures out the
+ rate limits and dynamically adjusts them if the processing conditions change. Backpressure
+ can be enabled by setting the [configuration parameter](configuration.html#spark-streaming)
+ `spark.streaming.backpressure.enabled` to `true`.
+
### Upgrading Application Code
{:.no_toc}
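
The write ahead log discussed in the first hunk is enabled through a single configuration flag. A minimal Scala sketch, where the app name and checkpoint path are illustrative placeholders rather than anything from the commit:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// App name and checkpoint path are illustrative placeholders.
val conf = new SparkConf()
  .setAppName("WALExample")
  // Write all data received by receivers into a write ahead log
  // stored under the checkpoint directory.
  .set("spark.streaming.receiver.writeAheadLog.enable", "true")

val ssc = new StreamingContext(conf, Seconds(1))
// The write ahead log lives under this checkpoint directory.
ssc.checkpoint("hdfs:///checkpoints/wal-example")
```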
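The second hunk's surrounding text recommends that, with the write ahead log enabled, received data be stored with `StorageLevel.MEMORY_AND_DISK_SER` instead of a replicated level. A minimal sketch, assuming a socket source with placeholder host and port:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("StorageLevelExample")
val ssc = new StreamingContext(conf, Seconds(1))

// With the write ahead log enabled, in-memory replication of received
// data is redundant, so keep a single serialized copy on memory and disk.
// Host and port are illustrative placeholders.
val lines = ssc.socketTextStream(
  "localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER)
```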
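The rate-limiting and backpressure settings this commit documents are all plain configuration parameters. A minimal sketch, with illustrative limits rather than recommended values:

```scala
import org.apache.spark.SparkConf

// The numeric limits below are placeholders, not recommendations.
val conf = new SparkConf()
  .setAppName("RateLimitExample")
  // Cap each receiver at 1000 records/sec.
  .set("spark.streaming.receiver.maxRate", "1000")
  // Cap each Kafka partition at 1000 records/sec (Direct Kafka approach).
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")
  // Alternatively (Spark 1.5+), let Spark choose and adapt the rates itself.
  .set("spark.streaming.backpressure.enabled", "true")
```

With backpressure enabled, the static limits above act as upper bounds on the dynamically chosen rate rather than as fixed rates.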