[SPARK-12507][STREAMING][DOCUMENT] Expose closeFileAfterWrite and allowBatching configurations for Streaming

/cc tdas brkyvz

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #10453 from zsxwing/streaming-conf.

author: Shixiong Zhu <shixiong@databricks.com> 2016-01-07 17:37:46 -0800
committer: Tathagata Das <tathagata.das1565@gmail.com> 2016-01-07 17:37:46 -0800
commit: c94199e977279d9b4658297e8108b46bdf30157b (patch)
tree: 0f33916993f15858184e52647ced22521ee165bb /docs/streaming-programming-guide.md
parent: 5a4021998ab0f1c8bbb610eceecdf879d149a7b8 (diff)
1 files changed, 5 insertions, 7 deletions
diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md
index 3b071c7da5..1edc0fe347 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -1985,7 +1985,11 @@ To run a Spark Streaming applications, you need to have the following.
   to increase aggregate throughput. Additionally, it is recommended that the replication of the
   received data within Spark be disabled when the write ahead log is enabled as the log is already
   stored in a replicated storage system. This can be done by setting the storage level for the
-  input stream to `StorageLevel.MEMORY_AND_DISK_SER`.
+  input stream to `StorageLevel.MEMORY_AND_DISK_SER`. While using S3 (or any file system that
+  does not support flushing) for _write ahead logs_, please remember to enable
+  `spark.streaming.driver.writeAheadLog.closeFileAfterWrite` and
+  `spark.streaming.receiver.writeAheadLog.closeFileAfterWrite`. See
+  [Spark Streaming Configuration](configuration.html#spark-streaming) for more details.
 
 - *Setting the max receiving rate* - If the cluster resources is not large enough for the streaming
   application to process data as fast as it is being received, the receivers can be rate limited
@@ -2023,12 +2027,6 @@ contains serialized Scala/Java/Python objects and trying to deserialize objects
 modified classes may lead to errors. In this case, either start the upgraded app with a different
 checkpoint directory, or delete the previous checkpoint directory.
 
-### Other Considerations
-{:.no_toc}
-If the data is being received by the receivers faster than what can be processed,
-you can limit the rate by setting the [configuration parameter](configuration.html#spark-streaming)
-`spark.streaming.receiver.maxRate`.
-
 ***
 
 ## Monitoring Applications
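
The added paragraph in the first hunk names two properties to enable when write ahead logs live on S3 or another file system that does not support flushing. As a minimal sketch, assuming the application is launched with `spark-submit` and that the receiver write ahead log is switched on through `spark.streaming.receiver.writeAheadLog.enable`, the settings could be rendered as `--conf` arguments like this. The helper name `spark_submit_conf_args` and the file name `app.py` are hypothetical and not part of the patch.

```python
import shlex

# Properties named in the patch, plus the switch that enables the receiver
# write ahead log (an assumption about how the application is configured).
WAL_ON_S3_CONF = {
    "spark.streaming.receiver.writeAheadLog.enable": "true",
    "spark.streaming.driver.writeAheadLog.closeFileAfterWrite": "true",
    "spark.streaming.receiver.writeAheadLog.closeFileAfterWrite": "true",
}


def spark_submit_conf_args(conf):
    """Return a flat list of `--conf key=value` arguments (hypothetical helper)."""
    args = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # app.py is a placeholder for the streaming application being submitted.
    command = ["spark-submit", *spark_submit_conf_args(WAL_ON_S3_CONF), "app.py"]
    print(" ".join(shlex.quote(part) for part in command))
```

The same keys can instead go in `spark-defaults.conf` or be set on the application's `SparkConf`. The storage level mentioned in the hunk is chosen in application code when the input stream is created, so it does not appear here.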