author | likun <jacky.likun@huawei.com> | 2014-10-17 10:33:45 -0700 |
---|---|---|
committer | Andrew Or <andrewor14@gmail.com> | 2014-10-17 10:33:45 -0700 |
commit | c351862064ed7d2031ea4c8bf33881e5f702ea0a (patch) | |
tree | ab4afa5214bfd7cea6865c712836ea9ac4024104 | |
parent | e678b9f02a2936b35c95e91a5f0ff388b5720261 (diff) | |
[SPARK-3935][Core] log the number of records that has been written
There is an unused variable (count) in saveAsHadoopDataset in PairRDDFunctions.scala. The variable seems to have been intended to count the number of records, so the first commit added a log statement reporting how many records had been written to the writer. The follow-up commit removes the unused variable instead, and that removal is what the squashed diff contains.
Author: likun <jacky.likun@huawei.com>
Author: jackylk <jacky.likun@huawei.com>
Closes #2791 from jackylk/SPARK-3935 and squashes the following commits:
a874047 [jackylk] removing the unused variable in PairRddFunctions.scala
3bf43c7 [likun] log the number of records has been written
-rw-r--r-- | core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala | 2 |
1 file changed, 0 insertions, 2 deletions
diff --git a/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala b/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
index 929ded58a3..ac96de86dd 100644
--- a/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
@@ -1032,10 +1032,8 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
       writer.setup(context.stageId, context.partitionId, attemptNumber)
       writer.open()
       try {
-        var count = 0
         while (iter.hasNext) {
           val record = iter.next()
-          count += 1
           writer.write(record._1.asInstanceOf[AnyRef], record._2.asInstanceOf[AnyRef])
         }
       } finally {
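
For context, the alternative described in the commit message (keeping the counter and reporting it) could look roughly like the sketch below. It is not the code from either squashed commit. SimpleWriter, RecordCountSketch, and writeAll are hypothetical names that stand in for the Hadoop writer and the loop in saveAsHadoopDataset, the close() call in the finally block is an assumption, and println stands in for Spark's logging.

```scala
// Hypothetical stand-in for the writer used in saveAsHadoopDataset;
// this trait is an assumption made for the sketch, not a Spark API.
trait SimpleWriter {
  def write(key: AnyRef, value: AnyRef): Unit
  def close(): Unit
}

object RecordCountSketch {
  /** Writes every record from `iter` and returns how many were written. */
  def writeAll[K, V](iter: Iterator[(K, V)], writer: SimpleWriter): Long = {
    var count = 0L
    try {
      while (iter.hasNext) {
        val record = iter.next()
        writer.write(record._1.asInstanceOf[AnyRef], record._2.asInstanceOf[AnyRef])
        count += 1 // incremented only after a successful write
      }
    } finally {
      writer.close() // assumed cleanup step; the diff is cut off before this point
    }
    count
  }

  def main(args: Array[String]): Unit = {
    // In-memory writer so the sketch runs without Hadoop or Spark.
    val buffer = scala.collection.mutable.ArrayBuffer.empty[(AnyRef, AnyRef)]
    val writer = new SimpleWriter {
      def write(key: AnyRef, value: AnyRef): Unit = buffer += ((key, value))
      def close(): Unit = ()
    }
    val written = writeAll(Iterator("a" -> 1, "b" -> 2, "c" -> 3), writer)
    // println stands in for a logging call here.
    println(s"Wrote $written records")
  }
}
```

Returning the count from the helper, rather than logging inside the loop, leaves the caller free to decide whether and at what level to report it.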