Author:    Tathagata Das <tathagata.das1565@gmail.com>  2013-02-23 17:42:26 -0800
Committer: Tathagata Das <tathagata.das1565@gmail.com>  2013-02-23 17:42:26 -0800
Commit:    d853aa9658a87d644d483b1fa9d41c29e3ac0673 (patch)
Tree:      4a95469875543fd73e795185c335859fd442e71b /docs/streaming-programming-guide.md
Parent:    41285eaae3642b73b3ac5007a35cc4e8f1d7d084 (diff)
Change spark.cleaner.delay to spark.cleaner.ttl. Updated docs.
Diffstat (limited to 'docs/streaming-programming-guide.md')
-rw-r--r--  docs/streaming-programming-guide.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md
index 71e1bd4aab..4a5e3e36a5 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -335,7 +335,7 @@ For a Spark Streaming application running on a cluster to be stable, the process
A good approach to figure out the right batch size for your application is to test it with a conservative batch size (say, 5-10 seconds) and a low data rate. To verify whether the system is able to keep up with the data rate, you can check the value of the end-to-end delay experienced by each processed batch (in the Spark master logs, find the line containing the phrase "Total delay"). If the delay stays below the batch size, the system is stable. Otherwise, if the delay is continuously increasing, the system is unable to keep up and is therefore unstable. Once you have an idea of a stable configuration, you can try increasing the data rate and/or reducing the batch size. Note that a momentary increase in the delay due to a temporary rise in the data rate may be fine, as long as the delay reduces back to a low value (i.e., less than the batch size).
## 24/7 Operation
-By default, Spark does not forget any of the metadata (RDDs generated, stages processed, etc.). But for a Spark Streaming application to operate 24/7, it is necessary for Spark to periodically clean up its metadata. This can be enabled by setting the Java system property `spark.cleaner.delay` to the number of seconds you want any metadata to persist. For example, setting `spark.cleaner.delay` to 600 would cause Spark to periodically clean up all metadata and persisted RDDs that are older than 10 minutes. Note that this property needs to be set before the SparkContext is created.
+By default, Spark does not forget any of the metadata (RDDs generated, stages processed, etc.). But for a Spark Streaming application to operate 24/7, it is necessary for Spark to periodically clean up its metadata. This can be enabled by setting the Java system property `spark.cleaner.ttl` to the number of seconds you want any metadata to persist. For example, setting `spark.cleaner.ttl` to 600 would cause Spark to periodically clean up all metadata and persisted RDDs that are older than 10 minutes. Note that this property needs to be set before the SparkContext is created.
This value is closely tied to any window operation that is being used. Any window operation requires the input data to be persisted in memory for at least the duration of the window. Hence it is necessary to set this value to at least the duration of the largest window operation used in the Spark Streaming application. If it is set too low, the application will throw an exception saying so.
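
To make the interaction between `spark.cleaner.ttl`, the SparkContext, and window durations concrete, here is a minimal sketch in Scala. It assumes the Scala streaming API with a socket text source; the master URL, application name, host, port, and durations are illustrative, not part of the commit.

```scala
import spark.streaming.{Seconds, StreamingContext}

object CleanerTtlExample {
  def main(args: Array[String]) {
    // Keep metadata and persisted RDDs for one hour (3600 seconds).
    // This must be set before the StreamingContext (and hence the
    // underlying SparkContext) is created.
    System.setProperty("spark.cleaner.ttl", "3600")

    // 10-second batches; "local[2]" and the app name are illustrative.
    val ssc = new StreamingContext("local[2]", "CleanerTtlExample", Seconds(10))

    // Hypothetical text source on localhost:9999.
    val lines = ssc.socketTextStream("localhost", 9999)

    // The largest window here is 10 minutes (600 seconds), well below the
    // 3600-second TTL, so windowed data is still in memory when needed.
    lines.window(Seconds(600), Seconds(10)).print()

    ssc.start()
  }
}
```

The key point is the ordering and the sizing: the TTL is set before the context exists, and it comfortably exceeds the largest window used, so data is not cleaned up while a window operation still needs it.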