[SPARK-3356] [DOCS] Document when RDD elements' ordering within partitions is nondeterministic

As suggested by mateiz, and because it came up on the mailing list again last week, this attempts to document that ordering of elements is not guaranteed across RDD evaluations in groupBy, zip, and partition-wise RDD methods. Suggestions welcome about the wording, or other methods that need a note.

Author: Sean Owen <sowen@cloudera.com>

Closes #2508 from srowen/SPARK-3356 and squashes the following commits:

b7c96fd [Sean Owen] Undo change to programming guide
ad4aeec [Sean Owen] Don't mention ordering in partition-wise methods, reword description of ordering for zip methods per review, and add similar note to programming guide, which mentions groupByKey (but not zip methods)
fce943b [Sean Owen] Note that ordering of elements is not guaranteed across RDD evaluations in groupBy, zip, and partition-wise RDD methods
author: Sean Owen <sowen@cloudera.com> 2014-09-30 11:15:38 -0700
committer: Matei Zaharia <matei@databricks.com> 2014-09-30 11:15:38 -0700
commit: ab6dd80ba0f7e1042ea270d10400109a467fe40e (patch)
tree: f18c310909d5abdad8c78f7d1957358d8af8fce6 /docs/programming-guide.md
parent: 157e7d0f62eaf016a0c3749065ddcec170540a36 (diff)
download: spark-ab6dd80ba0f7e1042ea270d10400109a467fe40e.tar.gz
spark-ab6dd80ba0f7e1042ea270d10400109a467fe40e.tar.bz2
spark-ab6dd80ba0f7e1042ea270d10400109a467fe40e.zip
1 files changed, 1 insertions, 1 deletions
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 510b47a2aa..1d61a3c555 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -883,7 +883,7 @@ for details.
 <tr>
   <td> <b>groupByKey</b>([<i>numTasks</i>]) </td>
   <td> When called on a dataset of (K, V) pairs, returns a dataset of (K, Iterable&lt;V&gt;) pairs. <br />
-    <b>Note:</b> If you are grouping in order to perform an aggregation (such as a sum or 
+    <b>Note:</b> If you are grouping in order to perform an aggregation (such as a sum or
       average) over each key, using <code>reduceByKey</code> or <code>combineByKey</code> will yield much better 
       performance.
     <br />
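
The note in this hunk and the ordering caveat in the commit message can be illustrated with a small sketch. It is a plain-Python simulation, not Spark code: `group_by_key`, `reduce_by_key`, `run_a`, and `run_b` are hypothetical names made up for illustration, and the assumption is simply that two evaluations of the same data may consume partitions in a different order.

```python
from collections import defaultdict
from operator import add


def group_by_key(partitions):
    """Collect values per key in the order the partitions are consumed.

    Hypothetical stand-in for a grouping operation; not the Spark API.
    """
    groups = defaultdict(list)
    for partition in partitions:
        for key, value in partition:
            groups[key].append(value)
    return dict(groups)


def reduce_by_key(partitions, func):
    """Fold values per key with an associative, commutative function.

    Hypothetical stand-in for a per-key reduction; not the Spark API.
    """
    totals = {}
    for partition in partitions:
        for key, value in partition:
            totals[key] = func(totals[key], value) if key in totals else value
    return totals


# Two simulated evaluations of the same data (assumption: only the
# order in which partitions arrive differs between the runs).
run_a = [[("a", 1), ("b", 2)], [("a", 3), ("b", 4)]]
run_b = list(reversed(run_a))

grouped_a = group_by_key(run_a)  # values listed in run_a's arrival order
grouped_b = group_by_key(run_b)  # same values, different order within each key

summed_a = reduce_by_key(run_a, add)
summed_b = reduce_by_key(run_b, add)  # same totals as summed_a
```

In this sketch the grouped values for a key come out in a different order between the two runs, while the per-key sums agree, because addition does not depend on the order of its inputs. The sketch says nothing about performance; the guide's claim that `reduceByKey` or `combineByKey` performs much better is a separate point.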