aboutsummaryrefslogtreecommitdiff
path: root/python/pyspark/streaming/dstream.py
diff options
context:
space:
mode:
authorFrançois Garillot <francois@garillot.net>2016-05-03 11:42:47 -0700
committerShixiong Zhu <shixiong@databricks.com>2016-05-03 11:42:47 -0700
commit439e361010e51d2213c92ccabed5093be92a72ee (patch)
tree40a965988a20a4e9228bd0185f7df1695e6b71b0 /python/pyspark/streaming/dstream.py
parentca813330c716bed76ac0034c12f56665960a1105 (diff)
downloadspark-439e361010e51d2213c92ccabed5093be92a72ee.tar.gz
spark-439e361010e51d2213c92ccabed5093be92a72ee.tar.bz2
spark-439e361010e51d2213c92ccabed5093be92a72ee.zip
[SPARK-9819][STREAMING][DOCUMENTATION] Clarify doc for invReduceFunc in incremental versions of reduceByWindow
- that reduceFunc and invReduceFunc should be associative - that the intermediate result in iterated applications of inverseReduceFunc is its first argument Author: François Garillot <francois@garillot.net> Closes #8103 from huitseeker/issue/invReduceFuncDoc.
Diffstat (limited to 'python/pyspark/streaming/dstream.py')
-rw-r--r--python/pyspark/streaming/dstream.py4
1 files changed, 3 insertions, 1 deletions
diff --git a/python/pyspark/streaming/dstream.py b/python/pyspark/streaming/dstream.py
index 2056663872..67a0819601 100644
--- a/python/pyspark/streaming/dstream.py
+++ b/python/pyspark/streaming/dstream.py
@@ -454,7 +454,9 @@ class DStream(object):
This is more efficient than `invReduceFunc` is None.
@param reduceFunc: associative and commutative reduce function
- @param invReduceFunc: inverse reduce function of `reduceFunc`
+ @param invReduceFunc: inverse reduce function of `reduceFunc`; such that for all y,
+ and invertible x:
+ `invReduceFunc(reduceFunc(x, y), x) = y`
@param windowDuration: width of the window; must be a multiple of this DStream's
batching interval
@param slideDuration: sliding interval of the window (i.e., the interval after which