[SPARK-13339][DOCS] Clarify commutative / associative operator requirements for reduce, fold

Clarify that reduce functions need to be commutative, and fold functions do not See https://github.com/apache/spark/pull/11091 Author: Sean Owen <sowen@cloudera.com> Closes #11217 from srowen/SPARK-13339.
author: Sean Owen <sowen@cloudera.com> 2016-02-19 10:26:38 +0000
committer: Sean Owen <sowen@cloudera.com> 2016-02-19 10:26:38 +0000
commit: fb7e21797ed618d9754545a44f8f95f75b66757a (patch)
tree: 42b592cf1f25aeaf067c35afd75f9a3403182b99 /docs
parent: c776fce99b496a789ffcf2cfab78cf51eeea032b (diff)
download: spark-fb7e21797ed618d9754545a44f8f95f75b66757a.tar.gz
spark-fb7e21797ed618d9754545a44f8f95f75b66757a.tar.bz2
spark-fb7e21797ed618d9754545a44f8f95f75b66757a.zip
2 files changed, 3 insertions, 3 deletions
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index e45081464a..2d6f7767d9 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -1343,7 +1343,7 @@ value of the broadcast variable (e.g. if the variable is shipped to a new node l
 
 ## Accumulators
 
-Accumulators are variables that are only "added" to through an associative operation and can
+Accumulators are variables that are only "added" to through an associative and commutative operation and can
 therefore be efficiently supported in parallel. They can be used to implement counters (as in
 MapReduce) or sums. Spark natively supports accumulators of numeric types, and programmers
 can add support for new types. If accumulators are created with a name, they will be
diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md
index 677f5ff7be..4d1932bc8c 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -798,7 +798,7 @@ Some of the common ones are as follows.
   <td> <b>reduce</b>(<i>func</i>) </td>
   <td> Return a new DStream of single-element RDDs by aggregating the elements in each RDD of the
   source DStream using a function <i>func</i> (which takes two arguments and returns one).
-  The function should be associative so that it can be computed in parallel. </td>
+  The function should be associative and commutative so that it can be computed in parallel. </td>
 </tr>
 <tr>
   <td> <b>countByValue</b>() </td>
@@ -1072,7 +1072,7 @@ said two parameters - <i>windowLength</i> and <i>slideInterval</i>.
 <tr>
   <td> <b>reduceByWindow</b>(<i>func</i>, <i>windowLength</i>, <i>slideInterval</i>) </td>
   <td> Return a new single-element stream, created by aggregating elements in the stream over a
-  sliding interval using <i>func</i>. The function should be associative so that it can be computed
+  sliding interval using <i>func</i>. The function should be associative and commutative so that it can be computed
   correctly in parallel.
   </td>
 </tr>
author	Sean Owen <sowen@cloudera.com>	2016-02-19 10:26:38 +0000
committer	Sean Owen <sowen@cloudera.com>	2016-02-19 10:26:38 +0000
commit	fb7e21797ed618d9754545a44f8f95f75b66757a (patch)
tree	42b592cf1f25aeaf067c35afd75f9a3403182b99 /docs
parent	c776fce99b496a789ffcf2cfab78cf51eeea032b (diff)
download	spark-fb7e21797ed618d9754545a44f8f95f75b66757a.tar.gz spark-fb7e21797ed618d9754545a44f8f95f75b66757a.tar.bz2 spark-fb7e21797ed618d9754545a44f8f95f75b66757a.zip