author Madhu Siddalingaiah <madhu@madhu.com> 2014-12-01 08:45:34 -0800
committer Josh Rosen <joshrosen@databricks.com> 2014-12-01 08:45:34 -0800
commit 2b233f5fc4beb2c6ed4bc142e923e96f8bad3ec4
tree e820137691ec4b188a5d3c8c55cfe8fe91783989 /docs/programming-guide.md
parent 30a86acdefd5428af6d6264f59a037e0eefd74b4
Documentation: add description for repartitionAndSortWithinPartitions
Author: Madhu Siddalingaiah <madhu@madhu.com>

Closes #3390 from msiddalingaiah/master and squashes the following commits:

cbccbfe [Madhu Siddalingaiah] Documentation: replace <b> with <code> (again)
332f7a2 [Madhu Siddalingaiah] Documentation: replace <b> with <code>
cd2b05a [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master'
0fc12d7 [Madhu Siddalingaiah] Documentation: add description for repartitionAndSortWithinPartitions
Diffstat (limited to 'docs/programming-guide.md')
-rw-r--r-- docs/programming-guide.md | 6
1 files changed, 6 insertions, 0 deletions
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 7a16ee8742..5e0d5c15d7 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -934,6 +934,12 @@ for details.
<td> Reshuffle the data in the RDD randomly to create either more or fewer partitions and balance it across them.
This always shuffles all data over the network. </td>
</tr>
+<tr>
+ <td> <b>repartitionAndSortWithinPartitions</b>(<i>partitioner</i>) </td>
+ <td> Repartition the RDD according to the given partitioner and, within each resulting partition,
+ sort records by their keys. This is more efficient than calling <code>repartition</code> and then sorting within
+ each partition because it can push the sorting down into the shuffle machinery. </td>
+</tr>
</table>
### Actions
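
To illustrate the behaviour described in the new table row, here is a minimal plain-Python model of the semantics. It does not use Spark: the function name `repartition_and_sort_within_partitions`, its parameters, and the sample data are hypothetical and chosen only for this sketch. It routes each key-value pair to a partition and then sorts each partition by key. The efficiency benefit mentioned in the patch, where sorting is pushed into the shuffle, is not captured here.

```python
def repartition_and_sort_within_partitions(records, num_partitions, partition_func=hash):
    """Model of the semantics only (hypothetical helper, not a Spark API).

    Routes each (key, value) pair to a partition chosen by partition_func,
    then sorts the records inside every partition by key.
    """
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        # The partitioner decides which partition a key belongs to.
        index = partition_func(key) % num_partitions
        partitions[index].append((key, value))
    # Sort within each partition by key; there is no ordering across partitions.
    return [sorted(part, key=lambda kv: kv[0]) for part in partitions]


# Sample data (assumed): integer keys, partitioned by key parity.
records = [(3, "c"), (0, "x"), (4, "d"), (1, "a"), (2, "b"), (5, "e")]
result = repartition_and_sort_within_partitions(
    records, num_partitions=2, partition_func=lambda k: k
)
for i, part in enumerate(result):
    print(i, part)
```

Each partition comes out sorted by key, but the partitions taken together are not globally ordered, which is the distinction between this operation and a full sort.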