author | Sean Owen <sowen@cloudera.com> | 2016-01-19 09:34:49 +0000 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2016-01-19 09:34:49 +0000 |
commit | d8c4b00a234514cc3a877e3daed5d1378a2637e8 (patch) | |
tree | 41446b6b63422d682d34b3d4e6fa505dc73d72ed /python/pyspark/rdd.py | |
parent | c00744e60f77edb238aff1e30b450dca65451e91 (diff) | |
[SPARK-7683][PYSPARK] Confusing behavior of fold function of RDD in pyspark
Fix the order of the arguments that PySpark's RDD.fold passes to its op: it should be (acc, obj), as in the other implementations.
Obviously, this is a potentially breaking change, so it can only happen for 2.x.
CC davies
Author: Sean Owen <sowen@cloudera.com>
Closes #10771 from srowen/SPARK-7683.
Diffstat (limited to 'python/pyspark/rdd.py')
-rw-r--r-- | python/pyspark/rdd.py | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/python/pyspark/rdd.py b/python/pyspark/rdd.py
index a019c05862..c285946254 100644
--- a/python/pyspark/rdd.py
+++ b/python/pyspark/rdd.py
@@ -861,7 +861,7 @@ class RDD(object):
         def func(iterator):
             acc = zeroValue
             for obj in iterator:
-                acc = op(obj, acc)
+                acc = op(acc, obj)
             yield acc
         # collecting result of mapPartitions here ensures that the copy of
         # zeroValue provided to each partition is unique from the one provided
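
The effect of the swapped arguments is easiest to see with an op whose result depends on argument order. The following is a minimal sketch in plain Python, not the PySpark implementation. `fold_partitions`, `concat`, and the `acc_first` flag are hypothetical names made up for illustration, partitions are modelled as plain lists, and merging the per-partition results with `functools.reduce` is an assumption about the driver-side step, which this diff does not show.

```python
from functools import reduce


def fold_partitions(partitions, zero_value, op, acc_first=True):
    """Simulate a fold over partitions given as plain lists (illustrative only)."""

    def fold_one(items):
        acc = zero_value
        for obj in items:
            # acc_first=True mirrors the patched call op(acc, obj);
            # acc_first=False mirrors the previous call op(obj, acc).
            acc = op(acc, obj) if acc_first else op(obj, acc)
        return acc

    # One partial result per partition, then a final merge (assumed here).
    partials = [fold_one(p) for p in partitions]
    return reduce(op, partials, zero_value)


def concat(acc, obj):
    # String concatenation is order-sensitive, which makes the difference visible.
    return acc + obj


partitions = [["a", "b"], ["c", "d"]]
new_result = fold_partitions(partitions, "", concat, acc_first=True)
old_result = fold_partitions(partitions, "", concat, acc_first=False)
print(new_result)  # abcd
print(old_result)  # badc
```

String concatenation is used only to expose the ordering. With an op for which argument order does not matter, such as addition, both variants return the same value, which is presumably why the old order went unnoticed in common cases.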