| author | linbojin <linbojin203@gmail.com> | 2016-08-16 11:37:54 +0100 |
|---|---|---|
| committer | Sean Owen <sowen@cloudera.com> | 2016-08-16 11:37:54 +0100 |
| commit | 6f0988b1293a5e5ee3620b2727ed969155d7ac0d (patch) | |
| tree | 6589f4f103c58c56d9d5097e41638ca33daa08e1 /licenses | |
| parent | 8fdc6ce400f9130399fbdd004df48b3ba95bcd6a (diff) | |
[MINOR][DOC] Correct code snippet results in quick start documentation
## What changes were proposed in this pull request?
The README.md file is updated over time, so some code snippet outputs in the quick start documentation no longer match the current README.md. For example:
```
scala> textFile.count()
res0: Long = 126
```
should be
```
scala> textFile.count()
res0: Long = 99
```
This PR adds comments pointing out this problem so that new Spark learners have a correct reference.
It also fixes a small bug: in the current documentation, the outputs of `linesWithSpark.count()` without and with cache are different (one is 15 and the other is 19):
```
scala> val linesWithSpark = textFile.filter(line => line.contains("Spark"))
linesWithSpark: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at filter at <console>:27
scala> textFile.filter(line => line.contains("Spark")).count() // How many lines contain "Spark"?
res3: Long = 15
...
scala> linesWithSpark.cache()
res7: linesWithSpark.type = MapPartitionsRDD[2] at filter at <console>:27
scala> linesWithSpark.count()
res8: Long = 19
```
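Because the expected outputs track the current contents of README.md, they can be re-derived outside the Spark shell before updating the docs. A minimal sketch, assuming it is run from the Spark source root (where README.md lives):

```shell
# Re-derive the quick start counts from the current README.md.
# These numbers change whenever README.md is edited.
wc -l < README.md           # total lines, i.e. textFile.count()
grep -c Spark README.md     # lines containing "Spark", i.e. linesWithSpark.count()
```

Whatever these commands print is what both the uncached and cached `linesWithSpark.count()` calls should show, since caching does not change an RDD's contents.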
## How was this patch tested?
Manual test: ran `$ SKIP_API=1 jekyll serve --watch` and checked the rendered page.
Author: linbojin <linbojin203@gmail.com>
Closes #14645 from linbojin/quick-start-documentation.
Diffstat (limited to 'licenses')
0 files changed, 0 insertions, 0 deletions