path: root/docs/programming-guide.md
author: Jacek Laskowski <jacek@japila.pl> 2016-04-24 10:36:33 +0100
committer: Sean Owen <sowen@cloudera.com> 2016-04-24 10:36:33 +0100
commit: 8df8a81825709dbefe5aecd7642748c1b3a38e99 (patch)
tree: 44728ee1510436ad5d16333774cfbca881dcedc4 /docs/programming-guide.md
parent: db7113b1d37e86253d8584b88ed66672f3620254 (diff)
[DOCS][MINOR] Screenshot + minor fixes to improve reading for accumulators
## What changes were proposed in this pull request?

Added screenshot + minor fixes to improve reading.

## How was this patch tested?

Manual.

Author: Jacek Laskowski <jacek@japila.pl>

Closes #12569 from jaceklaskowski/docs-accumulators.
Diffstat (limited to 'docs/programming-guide.md')
-rw-r--r--  docs/programming-guide.md  18
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 2f0ed5eca2..f398e38fbb 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -1328,12 +1328,18 @@ value of the broadcast variable (e.g. if the variable is shipped to a new node l
Accumulators are variables that are only "added" to through an associative and commutative operation and can
therefore be efficiently supported in parallel. They can be used to implement counters (as in
MapReduce) or sums. Spark natively supports accumulators of numeric types, and programmers
-can add support for new types. If accumulators are created with a name, they will be
+can add support for new types.
+
+If accumulators are created with a name, they will be
displayed in Spark's UI. This can be useful for understanding the progress of
running stages (NOTE: this is not yet supported in Python).
+<p style="text-align: center;">
+ <img src="img/spark-webui-accumulators.png" title="Accumulators in the Spark UI" alt="Accumulators in the Spark UI" />
+</p>
+
An accumulator is created from an initial value `v` by calling `SparkContext.accumulator(v)`. Tasks
-running on the cluster can then add to it using the `add` method or the `+=` operator (in Scala and Python).
+running on a cluster can then add to it using the `add` method or the `+=` operator (in Scala and Python).
However, they cannot read its value.
Only the driver program can read the accumulator's value, using its `value` method.
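The hunk above mentions that programmers can add support for new accumulator types. As a minimal sketch of what that looked like in the Spark 1.x-era API this guide documents, via the `AccumulatorParam` trait (the vector type and its element-wise semantics here are illustrative assumptions, not part of this diff):

{% highlight scala %}
import org.apache.spark.AccumulatorParam

// Sketch of a custom accumulator over Vector[Double].
object VectorAccumulatorParam extends AccumulatorParam[Vector[Double]] {
  // The "zero" value for the merge: an all-zeros vector of the same length.
  def zero(initialValue: Vector[Double]): Vector[Double] =
    Vector.fill(initialValue.length)(0.0)
  // How two partial results are merged: element-wise addition.
  def addInPlace(v1: Vector[Double], v2: Vector[Double]): Vector[Double] =
    (v1, v2).zipped.map(_ + _)
}

// Create an accumulator that uses the custom param.
val vecAccum = sc.accumulator(Vector(0.0, 0.0, 0.0))(VectorAccumulatorParam)
{% endhighlight %}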
@@ -1345,7 +1351,7 @@ The code below shows an accumulator being used to add up the elements of an arra
{% highlight scala %}
scala> val accum = sc.accumulator(0, "My Accumulator")
-accum: spark.Accumulator[Int] = 0
+accum: org.apache.spark.Accumulator[Int] = 0
scala> sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x)
...
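The hunk ends here, but the surrounding guide text notes that only the driver can read the accumulator's value. A sketch of that read, continuing the same shell session (the `res2` output line is illustrative, not taken from this diff):

{% highlight scala %}
scala> accum.value
res2: Int = 10
{% endhighlight %}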
@@ -1466,11 +1472,11 @@ Accumulators do not change the lazy evaluation model of Spark. If they are being
<div class="codetabs">
-<div data-lang="scala" markdown="1">
+<div data-lang="scala" markdown="1">
{% highlight scala %}
val accum = sc.accumulator(0)
-data.map { x => accum += x; f(x) }
-// Here, accum is still 0 because no actions have caused the <code>map</code> to be computed.
+data.map { x => accum += x; x }
+// Here, accum is still 0 because no actions have caused the map operation to be computed.
{% endhighlight %}
</div>
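To make the caveat in this last hunk concrete, here is a minimal sketch, assuming an RDD named `data` and a SparkContext `sc`, showing that an action is what actually triggers the accumulator updates:

{% highlight scala %}
val accum = sc.accumulator(0)
val mapped = data.map { x => accum += x; x }

// No action has run yet, so the map (and the += inside it) is still pending.
println(accum.value)  // 0

// Running an action forces the map to be computed...
mapped.count()

// ...and only now does the accumulator reflect the updates.
println(accum.value)  // the sum of the elements of data
{% endhighlight %}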