aboutsummaryrefslogtreecommitdiff
path: root/core/src/main/java/org
diff options
context:
space:
mode:
authorMatei Zaharia <matei@eecs.berkeley.edu>2013-10-19 23:40:40 -0700
committerMatei Zaharia <matei@eecs.berkeley.edu>2013-10-19 23:40:40 -0700
commit747f53892546eb8edf1c75788c9ecf9c939d375c (patch)
tree0cc31f0653e2bac5282f568dc6eab9632c347b3b /core/src/main/java/org
parent6511bbe2adeb5e361fb3c31bbda245eeb890647a (diff)
parent7eaa56de7f0253869fa85d4366f1048386af477e (diff)
downloadspark-747f53892546eb8edf1c75788c9ecf9c939d375c.tar.gz
spark-747f53892546eb8edf1c75788c9ecf9c939d375c.tar.bz2
spark-747f53892546eb8edf1c75788c9ecf9c939d375c.zip
Merge pull request #83 from ewencp/pyspark-accumulator-add-method
Add an add() method to pyspark accumulators. Add a regular method for adding a term to accumulators in pyspark. Currently if you have a non-global accumulator, adding to it is awkward. The += operator can't be used for non-global accumulators captured via closure because it's involves an assignment. The only way to do it is using __iadd__ directly. Adding this method lets you write code like this: def main(): sc = SparkContext() accum = sc.accumulator(0) rdd = sc.parallelize([1,2,3]) def f(x): accum.add(x) rdd.foreach(f) print accum.value where using accum += x instead would have caused UnboundLocalError exceptions in workers. Currently it would have to be written as accum.__iadd__(x).
Diffstat (limited to 'core/src/main/java/org')
0 files changed, 0 insertions, 0 deletions