[SPARK-12429][STREAMING][DOC] Add Accumulator and Broadcast example for Streaming

This PR adds Scala, Java and Python examples to show how to use Accumulator and Broadcast in Spark Streaming to support checkpointing.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #10385 from zsxwing/accumulator-broadcast-example.
author: Shixiong Zhu <shixiong@databricks.com> 2015-12-22 16:39:10 -0800
committer: Tathagata Das <tathagata.das1565@gmail.com> 2015-12-22 16:39:10 -0800
commit: 20591afd790799327f99485c5a969ed7412eca45 (patch)
tree: dad9877404d7559a53aab11d9a01df342cd17498 /docs/programming-guide.md
parent: 93db50d1c2ff97e6eb9200a995e4601f752968ae (diff)
1 files changed, 3 insertions, 3 deletions
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index c5e2a1cd7b..bad25e63e8 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -806,7 +806,7 @@ However, in `cluster` mode, what happens is more complicated, and the above may
 
 What is happening here is that the variables within the closure sent to each executor are now copies and thus, when **counter** is referenced within the `foreach` function, it's no longer the **counter** on the driver node. There is still a **counter** in the memory of the driver node but this is no longer visible to the executors! The executors only see the copy from the serialized closure. Thus, the final value of **counter** will still be zero since all operations on **counter** were referencing the value within the serialized closure.  
 
-To ensure well-defined behavior in these sorts of scenarios one should use an [`Accumulator`](#AccumLink). Accumulators in Spark are used specifically to provide a mechanism for safely updating a variable when execution is split up across worker nodes in a cluster. The Accumulators section of this guide discusses these in more detail.  
+To ensure well-defined behavior in these sorts of scenarios one should use an [`Accumulator`](#accumulators). Accumulators in Spark are used specifically to provide a mechanism for safely updating a variable when execution is split up across worker nodes in a cluster. The Accumulators section of this guide discusses these in more detail.  
 
 In general, closures - constructs like loops or locally defined methods, should not be used to mutate some global state. Spark does not define or guarantee the behavior of mutations to objects referenced from outside of closures. Some code that does this may work in local mode, but that's just by accident and such code will not behave as expected in distributed mode. Use an Accumulator instead if some global aggregation is needed.
 
@@ -1091,7 +1091,7 @@ for details.
 </tr>
 <tr>
   <td> <b>foreach</b>(<i>func</i>) </td>
-  <td> Run a function <i>func</i> on each element of the dataset. This is usually done for side effects such as updating an <a href="#AccumLink">Accumulator</a> or interacting with external storage systems.
+  <td> Run a function <i>func</i> on each element of the dataset. This is usually done for side effects such as updating an <a href="#accumulators">Accumulator</a> or interacting with external storage systems.
   <br /><b>Note</b>: modifying variables other than Accumulators outside of the <code>foreach()</code> may result in undefined behavior. See <a href="#ClosuresLink">Understanding closures </a> for more details.</td>
 </tr>
 </table>
@@ -1338,7 +1338,7 @@ run on the cluster so that `v` is not shipped to the nodes more than once. In ad
 `v` should not be modified after it is broadcast in order to ensure that all nodes get the same
 value of the broadcast variable (e.g. if the variable is shipped to a new node later).
 
-## Accumulators <a name="AccumLink"></a>
+## Accumulators
 
 Accumulators are variables that are only "added" to through an associative operation and can
 therefore be efficiently supported in parallel. They can be used to implement counters (as in
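
Note on the closure paragraph in the first hunk: the copy semantics it describes can be imitated without Spark. The sketch below is plain Python and is not how Spark is implemented. It assumes that `pickle` is a fair stand-in for closure serialization, and `CounterTask` and `run_on_executor` are hypothetical names invented for this illustration. Each simulated executor mutates its own deserialized copy, so the driver-side value stays at zero, while merging the per-partition results on the driver (roughly what an Accumulator does for you) gives the expected sum.

```python
import pickle


class CounterTask:
    """Hypothetical stand-in for a closure that captures a driver-side counter."""

    def __init__(self, counter):
        self.counter = counter

    def run(self, partition):
        # Mutates the counter held by *this* object only.
        for x in partition:
            self.counter += x
        return self.counter


def run_on_executor(task_bytes, partition):
    """Simulate an executor: deserialize a private copy of the task and run it."""
    task = pickle.loads(task_bytes)
    return task.run(partition)


# Driver side: build the task once and serialize it, as a closure would be.
driver_task = CounterTask(counter=0)
payload = pickle.dumps(driver_task)

# Two simulated partitions, each processed with its own copy of the task.
partitions = [[1, 2], [3, 4]]
partial_sums = [run_on_executor(payload, p) for p in partitions]

# The driver's counter was never touched by the executors.
print(driver_task.counter)

# Accumulator-like merge: combine per-partition results on the driver.
total = sum(partial_sums)
print(total)
```

Because addition is associative, the partial results can be merged in any order, which is the property the Accumulators section relies on.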