diff options
author | Matei Zaharia <matei@eecs.berkeley.edu> | 2013-10-19 23:40:40 -0700 |
---|---|---|
committer | Matei Zaharia <matei@eecs.berkeley.edu> | 2013-10-19 23:40:40 -0700 |
commit | 747f53892546eb8edf1c75788c9ecf9c939d375c (patch) | |
tree | 0cc31f0653e2bac5282f568dc6eab9632c347b3b /core/src/main/java/org | |
parent | 6511bbe2adeb5e361fb3c31bbda245eeb890647a (diff) | |
parent | 7eaa56de7f0253869fa85d4366f1048386af477e (diff) | |
download | spark-747f53892546eb8edf1c75788c9ecf9c939d375c.tar.gz spark-747f53892546eb8edf1c75788c9ecf9c939d375c.tar.bz2 spark-747f53892546eb8edf1c75788c9ecf9c939d375c.zip |
Merge pull request #83 from ewencp/pyspark-accumulator-add-method
Add an add() method to pyspark accumulators.
Add a regular method for adding a term to accumulators in
pyspark. Currently if you have a non-global accumulator, adding to it
is awkward. The += operator can't be used for non-global accumulators
captured via closure because it's involves an assignment. The only way
to do it is using __iadd__ directly.
Adding this method lets you write code like this:
def main():
sc = SparkContext()
accum = sc.accumulator(0)
rdd = sc.parallelize([1,2,3])
def f(x):
accum.add(x)
rdd.foreach(f)
print accum.value
where using accum += x instead would have caused UnboundLocalError
exceptions in workers. Currently it would have to be written as
accum.__iadd__(x).
Diffstat (limited to 'core/src/main/java/org')
0 files changed, 0 insertions, 0 deletions