author: Sean Owen <sowen@cloudera.com> 2016-02-19 10:26:38 +0000
committer: Sean Owen <sowen@cloudera.com> 2016-02-19 10:26:38 +0000
commit: fb7e21797ed618d9754545a44f8f95f75b66757a
tree: 42b592cf1f25aeaf067c35afd75f9a3403182b99 /python/pyspark/streaming/dstream.py
parent: c776fce99b496a789ffcf2cfab78cf51eeea032b
[SPARK-13339][DOCS] Clarify commutative / associative operator requirements for reduce, fold
Clarify that reduce functions need to be commutative, and fold functions do not.
See https://github.com/apache/spark/pull/11091
Author: Sean Owen <sowen@cloudera.com>
Closes #11217 from srowen/SPARK-13339.
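To illustrate why the commit message asks for commutativity on top of associativity, here is a minimal plain-Python sketch. It does not use Spark. It assumes that each partition is reduced locally and that the partial results may then be merged in any order. The helper `partitioned_reduce` and the sample data are hypothetical and exist only for this illustration.

```python
from functools import reduce
from itertools import permutations


def partitioned_reduce(func, partitions, merge_order):
    """Reduce each partition locally, then merge the partials in merge_order.

    Hypothetical helper: a toy model of a distributed reduce, not Spark code.
    """
    partials = [reduce(func, part) for part in partitions]
    return reduce(func, [partials[i] for i in merge_order])


def add(x, y):
    # On numbers: associative and commutative.
    # On strings: associative but NOT commutative.
    return x + y


# Every possible order in which three partial results could be merged.
orders = list(permutations(range(3)))

string_partitions = [["a", "b"], ["c"], ["d", "e"]]
concat_results = {partitioned_reduce(add, string_partitions, o) for o in orders}

numeric_partitions = [[1, 2], [3], [4, 5]]
sum_results = {partitioned_reduce(add, numeric_partitions, o) for o in orders}

print(sorted(concat_results))  # one distinct string per merge order
print(sum_results)             # a single value regardless of merge order
```

Under these assumptions, numeric addition gives the same answer for every merge order, while string concatenation gives a different answer for each one.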
Diffstat (limited to 'python/pyspark/streaming/dstream.py')
-rw-r--r-- python/pyspark/streaming/dstream.py | 4
1 files changed, 2 insertions, 2 deletions
diff --git a/python/pyspark/streaming/dstream.py b/python/pyspark/streaming/dstream.py
index 86447f5e58..2056663872 100644
--- a/python/pyspark/streaming/dstream.py
+++ b/python/pyspark/streaming/dstream.py
@@ -453,7 +453,7 @@ class DStream(object):
2. "inverse reduce" the old values that left the window (e.g., subtracting old counts)
This is more efficient than `invReduceFunc` is None.
- @param reduceFunc: associative reduce function
+ @param reduceFunc: associative and commutative reduce function
@param invReduceFunc: inverse reduce function of `reduceFunc`
@param windowDuration: width of the window; must be a multiple of this DStream's
batching interval
@@ -524,7 +524,7 @@ class DStream(object):
`invFunc` can be None, then it will reduce all the RDDs in window, could be slower
than having `invFunc`.
- @param func: associative reduce function
+ @param func: associative and commutative reduce function
@param invFunc: inverse function of `reduceFunc`
@param windowDuration: width of the window; must be a multiple of this DStream's
batching interval
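
The docstrings touched by this diff describe an incremental windowed reduce: reduce in the new values that entered the window, then "inverse reduce" the old values that left it. The following is a minimal plain-Python sketch of that idea, not the DStream implementation. It assumes one already-reduced value per batch and a window that slides by one batch at a time. The names `windowed_full` and `windowed_incremental` are hypothetical.

```python
from functools import reduce
from operator import add, sub


def windowed_full(batches, window, func):
    """Recompute each window from scratch (the no-inverse-function path)."""
    return [
        reduce(func, batches[max(0, i - window + 1): i + 1])
        for i in range(len(batches))
    ]


def windowed_incremental(batches, window, func, inv_func):
    """Update the previous window result instead of recomputing it."""
    results = []
    current = None
    for i, new in enumerate(batches):
        # 1. Reduce in the new value that entered the window.
        current = new if current is None else func(current, new)
        # 2. "Inverse reduce" the old value that left the window.
        if i >= window:
            current = inv_func(current, batches[i - window])
        results.append(current)
    return results


counts = [3, 1, 4, 1, 5, 9]  # hypothetical per-batch counts
full = windowed_full(counts, 3, add)
incremental = windowed_incremental(counts, 3, add, sub)
print(full)         # [3, 4, 8, 6, 10, 15]
print(incremental)  # [3, 4, 8, 6, 10, 15]
```

The incremental version does a constant amount of work per batch, whereas the full version does work proportional to the window width. It only gives the right answer when `inv_func` really undoes `func`, as subtraction undoes addition here.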