author: Sean Owen <sowen@cloudera.com> 2014-09-30 11:15:38 -0700
committer: Matei Zaharia <matei@databricks.com> 2014-09-30 11:15:38 -0700
commit: ab6dd80ba0f7e1042ea270d10400109a467fe40e
tree: f18c310909d5abdad8c78f7d1957358d8af8fce6 /docs/programming-guide.md
parent: 157e7d0f62eaf016a0c3749065ddcec170540a36
[SPARK-3356] [DOCS] Document when RDD elements' ordering within partitions is nondeterministic

As suggested by mateiz, and because it came up on the mailing list again last week, this attempts to document that ordering of elements is not guaranteed across RDD evaluations in groupBy, zip, and partition-wise RDD methods. Suggestions welcome about the wording, or other methods that need a note.

Author: Sean Owen <sowen@cloudera.com>

Closes #2508 from srowen/SPARK-3356 and squashes the following commits:

b7c96fd [Sean Owen] Undo change to programming guide
ad4aeec [Sean Owen] Don't mention ordering in partition-wise methods, reword description of ordering for zip methods per review, and add similar note to programming guide, which mentions groupByKey (but not zip methods)
fce943b [Sean Owen] Note that ordering of elements is not guaranteed across RDD evaluations in groupBy, zip, and partition-wise RDD methods
Diffstat (limited to 'docs/programming-guide.md')
-rw-r--r-- docs/programming-guide.md 2
1 files changed, 1 insertions, 1 deletions
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 510b47a2aa..1d61a3c555 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -883,7 +883,7 @@ for details.
<tr>
<td> <b>groupByKey</b>([<i>numTasks</i>]) </td>
<td> When called on a dataset of (K, V) pairs, returns a dataset of (K, Iterable&lt;V&gt;) pairs. <br />
- <b>Note:</b> If you are grouping in order to perform an aggregation (such as a sum or
+ <b>Note:</b> If you are grouping in order to perform an aggregation (such as a sum or
average) over each key, using <code>reduceByKey</code> or <code>combineByKey</code> will yield much better
performance.
<br />