aboutsummaryrefslogtreecommitdiff
path: root/streaming
diff options
context:
space:
mode:
authorzsxwing <zsxwing@gmail.com>2014-11-19 13:17:15 -0800
committerTathagata Das <tathagata.das1565@gmail.com>2014-11-19 13:17:15 -0800
commit3bf7ceebb10741a8b32e0c00f0edfd3a222ec5cd (patch)
tree97b108e3a263289eda96dfbc71c80adfd3dd2f39 /streaming
parent22fc4e751c0a2f0ff39e42aa0a8fb9459d7412ec (diff)
downloadspark-3bf7ceebb10741a8b32e0c00f0edfd3a222ec5cd.tar.gz
spark-3bf7ceebb10741a8b32e0c00f0edfd3a222ec5cd.tar.bz2
spark-3bf7ceebb10741a8b32e0c00f0edfd3a222ec5cd.zip
[SPARK-4481][Streaming][Doc] Fix the wrong description of updateFunc
Removed `If `this` function returns None, then corresponding state key-value pair will be eliminated.` for the description of `updateFunc: (Iterator[(K, Seq[V], Option[S])]) => Iterator[(K, S)]` Author: zsxwing <zsxwing@gmail.com> Closes #3356 from zsxwing/SPARK-4481 and squashes the following commits: 76a9891 [zsxwing] Add a note that keys may be added or removed 0ebc42a [zsxwing] Fix the wrong description of updateFunc
Diffstat (limited to 'streaming')
-rw-r--r--streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala16
1 files changed, 7 insertions, 9 deletions
diff --git a/streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala b/streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala
index b39f47f04a..3f03f42270 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala
@@ -398,10 +398,9 @@ class PairDStreamFunctions[K, V](self: DStream[(K,V)])
* Return a new "state" DStream where the state for each key is updated by applying
* the given function on the previous state of the key and the new values of each key.
* org.apache.spark.Partitioner is used to control the partitioning of each RDD.
- * @param updateFunc State update function. If `this` function returns None, then
- * corresponding state key-value pair will be eliminated. Note, that
- * this function may generate a different a tuple with a different key
- * than the input key. It is up to the developer to decide whether to
+ * @param updateFunc State update function. Note, that this function may generate a different
+ * tuple with a different key than the input key. Therefore keys may be removed
+ * or added in this way. It is up to the developer to decide whether to
* remember the partitioner despite the key being changed.
* @param partitioner Partitioner for controlling the partitioning of each RDD in the new
* DStream
@@ -442,11 +441,10 @@ class PairDStreamFunctions[K, V](self: DStream[(K,V)])
* Return a new "state" DStream where the state for each key is updated by applying
* the given function on the previous state of the key and the new values of each key.
* org.apache.spark.Partitioner is used to control the partitioning of each RDD.
- * @param updateFunc State update function. If `this` function returns None, then
- * corresponding state key-value pair will be eliminated. Note, that
- * this function may generate a different a tuple with a different key
- * than the input key. It is up to the developer to decide whether to
- * remember the partitioner despite the key being changed.
+ * @param updateFunc State update function. Note, that this function may generate a different
+ * tuple with a different key than the input key. Therefore keys may be removed
+ * or added in this way. It is up to the developer to decide whether to
+ * remember the partitioner despite the key being changed.
* @param partitioner Partitioner for controlling the partitioning of each RDD in the new
* DStream
* @param rememberPartitioner Whether to remember the paritioner object in the generated RDDs.