diff options
author | gasparms <gmunoz@stratio.com> | 2015-02-14 20:10:29 +0000 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2015-02-14 20:10:29 +0000 |
commit | f80e2629bb74bc62960c61ff313f7e7802d61319 (patch) | |
tree | dda0edc8c3b7043898e3d7b03fd9124e61c8778c /docs | |
parent | e98dfe627c5d0201464cdd0f363f391ea84c389a (diff) | |
download | spark-f80e2629bb74bc62960c61ff313f7e7802d61319.tar.gz spark-f80e2629bb74bc62960c61ff313f7e7802d61319.tar.bz2 spark-f80e2629bb74bc62960c61ff313f7e7802d61319.zip |
[SPARK-5800] Streaming Docs. Change linked files according the selected language
Currently, Spark Streaming Programming Guide after updateStateByKey explanation links to file stateful_network_wordcount.py and note "For the complete Scala code ..." for any language tab selected. This is an incoherence.
I've changed the guide and link its pertinent example file. JavaStatefulNetworkWordCount.java example was not created so I added to the commit.
Author: gasparms <gmunoz@stratio.com>
Closes #4589 from gasparms/feature/streaming-guide and squashes the following commits:
7f37f89 [gasparms] More style changes
ec202b0 [gasparms] Follow spark style guide
f527328 [gasparms] Improve example to look like scala example
4d8785c [gasparms] Remove throw exception
e92e6b8 [gasparms] Fix incoherence
92db405 [gasparms] Fix Streaming Programming Guide. Change files according the selected language
Diffstat (limited to 'docs')
-rw-r--r-- | docs/streaming-programming-guide.md | 21 |
1 files changed, 17 insertions, 4 deletions
diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md index 96fb12ce5e..997de9511c 100644 --- a/docs/streaming-programming-guide.md +++ b/docs/streaming-programming-guide.md @@ -878,6 +878,12 @@ This is applied on a DStream containing words (say, the `pairs` DStream containi val runningCounts = pairs.updateStateByKey[Int](updateFunction _) {% endhighlight %} +The update function will be called for each word, with `newValues` having a sequence of 1's (from +the `(word, 1)` pairs) and the `runningCount` having the previous count. For the complete +Scala code, take a look at the example +[StatefulNetworkWordCount.scala]({{site.SPARK_GITHUB_URL}}/blob/master/examples/src/main/scala/org/apache +/spark/examples/streaming/StatefulNetworkWordCount.scala). + </div> <div data-lang="java" markdown="1"> @@ -899,6 +905,13 @@ This is applied on a DStream containing words (say, the `pairs` DStream containi JavaPairDStream<String, Integer> runningCounts = pairs.updateStateByKey(updateFunction); {% endhighlight %} +The update function will be called for each word, with `newValues` having a sequence of 1's (from +the `(word, 1)` pairs) and the `runningCount` having the previous count. For the complete +Java code, take a look at the example +[JavaStatefulNetworkWordCount.java]({{site +.SPARK_GITHUB_URL}}/blob/master/examples/src/main/java/org/apache/spark/examples/streaming +/JavaStatefulNetworkWordCount.java). + </div> <div data-lang="python" markdown="1"> @@ -916,14 +929,14 @@ This is applied on a DStream containing words (say, the `pairs` DStream containi runningCounts = pairs.updateStateByKey(updateFunction) {% endhighlight %} -</div> -</div> - The update function will be called for each word, with `newValues` having a sequence of 1's (from the `(word, 1)` pairs) and the `runningCount` having the previous count. For the complete -Scala code, take a look at the example +Python code, take a look at the example [stateful_network_wordcount.py]({{site.SPARK_GITHUB_URL}}/blob/master/examples/src/main/python/streaming/stateful_network_wordcount.py). +</div> +</div> + Note that using `updateStateByKey` requires the checkpoint directory to be configured, which is discussed in detail in the [checkpointing](#checkpointing) section. |