author    Kan Zhang <kzhang@apache.org>  2014-06-14 13:22:30 -0700
committer Reynold Xin <rxin@apache.org>  2014-06-14 13:36:21 -0700
commit    05d85c86ecbf75f7bb13efcf24b3af4e9e3ef612 (patch)
tree      8c7e05c869d57307683004f8bdd603769a38215e /docs
parent    b1a7e99fe1fa89afb0e83c46a388b009037ec37d (diff)
[SPARK-2013] Documentation for saveAsPickleFile and pickleFile in Python
Author: Kan Zhang <kzhang@apache.org>

Closes #983 from kanzhang/SPARK-2013 and squashes the following commits:

0e128bb [Kan Zhang] [SPARK-2013] minor update
e728516 [Kan Zhang] [SPARK-2013] Documentation for saveAsPickleFile and pickleFile in Python

(cherry picked from commit b52603b039cdfa0f8e58ef3c6229d79e732ffc58)
Signed-off-by: Reynold Xin <rxin@apache.org>

Conflicts:
	docs/programming-guide.md
Diffstat (limited to 'docs')
-rw-r--r--  docs/programming-guide.md  6
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 7d77e640d0..b667aa0d17 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -379,10 +379,12 @@ Some notes on reading files with Spark:
* The `textFile` method also takes an optional second argument for controlling the number of slices of the file. By default, Spark creates one slice for each block of the file (blocks being 64MB by default in HDFS), but you can also ask for a higher number of slices by passing a larger value. Note that you cannot have fewer slices than blocks.
Apart from reading files as a collection of lines,
-`SparkContext.wholeTextFiles` lets you read a directory containing multiple small text files, and returns each of them as (filename, content) pairs. This is in contrast with `textFile`, which would return one record per line in each file.
-</div>
+* `SparkContext.wholeTextFiles` lets you read a directory containing multiple small text files, and returns each of them as (filename, content) pairs. This is in contrast with `textFile`, which would return one record per line in each file.
+* `RDD.saveAsPickleFile` and `SparkContext.pickleFile` support saving an RDD in a simple format consisting of pickled Python objects. Batching is used on pickle serialization, with default batch size 10.
+
+</div>
</div>
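The documentation added above notes that `saveAsPickleFile` batches records before pickle serialization, with a default batch size of 10. As a rough sketch of what that batching means (plain Python's `pickle` module only; the helper names below are illustrative and not part of Spark's API):

```python
import pickle

def pickle_in_batches(records, batch_size=10):
    """Pickle records in groups of batch_size, mirroring the
    batched serialization the saveAsPickleFile docs describe."""
    return [
        pickle.dumps(records[i:i + batch_size])
        for i in range(0, len(records), batch_size)
    ]

def unpickle_batches(blobs):
    """Flatten pickled batches back into one list of records."""
    records = []
    for blob in blobs:
        records.extend(pickle.loads(blob))
    return records

data = list(range(25))
blobs = pickle_in_batches(data)       # 25 records, batch size 10 -> 3 batches
assert len(blobs) == 3
assert unpickle_batches(blobs) == data
```

In PySpark itself, per the methods this commit documents, one would call `rdd.saveAsPickleFile(path)` to write an RDD and `sc.pickleFile(path)` to read it back.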