[SPARK-12638][API DOC] Parameter explanation not very accurate for rdd function "aggregate"

Currently, the parameter documentation for the RDD function "aggregate" is not clear, especially for the parameter "zeroValue". It is helpful to let junior Scala users know that "zeroValue" takes part in both the "seqOp" and "combOp" phases.

Author: Tommy YU <tummyyu@163.com>

Closes #10587 from Wenpei/rdd_aggregate_doc.
author: Tommy YU <tummyyu@163.com> 2016-01-12 13:20:04 +0000
committer: Sean Owen <sowen@cloudera.com> 2016-01-12 13:20:04 +0000
commit: 9f0995bb0d0bbe5d9b15a1ca9fa18e246ff90d66 (patch)
tree: 1f4e18e9670ec31574e73a975a1636c66e66e812 /core
parent: 9c7f34af37ef328149c1d66b4689d80a1589e1cc (diff)
1 files changed, 14 insertions, 0 deletions
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index de7102f5b6..53e01a0dbf 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -970,6 +970,13 @@ abstract class RDD[T: ClassTag](
    * apply the fold to each element sequentially in some defined ordering. For functions
    * that are not commutative, the result may differ from that of a fold applied to a
    * non-distributed collection.
+   *
+   * @param zeroValue the initial value for the accumulated result of each partition for the `op`
+   *                  operator, and also the initial value for the combine results from different
+   *                  partitions for the `op` operator - this will typically be the neutral
+   *                  element (e.g. `Nil` for list concatenation or `0` for summation)
+   * @param op an operator used to both accumulate results within a partition and combine results
+   *                  from different partitions
    */
   def fold(zeroValue: T)(op: (T, T) => T): T = withScope {
     // Clone the zero value since we will also be serializing it as part of tasks
@@ -988,6 +995,13 @@ abstract class RDD[T: ClassTag](
    * and one operation for merging two U's, as in scala.TraversableOnce. Both of these functions are
    * allowed to modify and return their first argument instead of creating a new U to avoid memory
    * allocation.
+   *
+   * @param zeroValue the initial value for the accumulated result of each partition for the
+   *                  `seqOp` operator, and also the initial value for the combine results from
+   *                  different partitions for the `combOp` operator - this will typically be the
+   *                  neutral element (e.g. `Nil` for list concatenation or `0` for summation)
+   * @param seqOp an operator used to accumulate results within a partition
+   * @param combOp an associative operator used to combine results from different partitions
    */
   def aggregate[U: ClassTag](zeroValue: U)(seqOp: (U, T) => U, combOp: (U, U) => U): U = withScope {
     // Clone the zero value since we will also be serializing it as part of tasks
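
The role of `zeroValue` described in the new Scaladoc can be illustrated without a Spark cluster. The following is a minimal sketch in plain Scala, not Spark's implementation: `AggregateSketch` and `aggregateLike` are hypothetical names introduced here, and each inner `Seq` is assumed to stand in for one partition. It folds every partition starting from `zeroValue` with `seqOp`, then folds the per-partition results starting from `zeroValue` again with `combOp`.

```scala
object AggregateSketch {
  // Hypothetical stand-in for RDD.aggregate (an assumption for illustration):
  // each inner Seq plays the role of one partition.
  def aggregateLike[T, U](partitions: Seq[Seq[T]])(zeroValue: U)(
      seqOp: (U, T) => U,
      combOp: (U, U) => U): U = {
    // seqOp phase: every partition starts accumulating from zeroValue.
    val perPartition = partitions.map(_.foldLeft(zeroValue)(seqOp))
    // combOp phase: the merge of partition results also starts from zeroValue.
    perPartition.foldLeft(zeroValue)(combOp)
  }

  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq(1, 2), Seq(3, 4), Seq(5))

    // Neutral element for summation: the result is the plain sum.
    val neutral = aggregateLike(partitions)(0)(_ + _, _ + _)
    println(s"zeroValue = 0 -> $neutral") // 15

    // Non-neutral zeroValue: added once per partition and once more when combining.
    val skewed = aggregateLike(partitions)(1)(_ + _, _ + _)
    println(s"zeroValue = 1 -> $skewed") // 19 = 15 + (3 partitions + 1) * 1
  }
}
```

With three partitions, a `zeroValue` of `1` contributes four times (once per partition and once in the combine step), which is why the documentation recommends the neutral element. Under the same assumptions, `fold` corresponds to passing the same `op` as both `seqOp` and `combOp`.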