SPARK-1727. Correct small compile errors, typos, and markdown issues in (primarily) MLlib docs

While play-testing the Scala and Java code examples in the MLlib docs, I noticed a number of small compile errors, and some typos. This led to finding and fixing a few similar items in other docs.

Then in the course of building the site docs to check the result, I found a few small suggestions for the build instructions. I also found a few more formatting and markdown issues uncovered when I accidentally used maruku instead of kramdown.

Author: Sean Owen <sowen@cloudera.com>

Closes #653 from srowen/SPARK-1727 and squashes the following commits:

6e7c38a [Sean Owen] Final doc updates - one more compile error, and use of mean instead of sum and count
8f5e847 [Sean Owen] Fix markdown syntax issues that maruku flags, even though we use kramdown (but only those that do not affect kramdown's output)
99966a9 [Sean Owen] Update issue tracker URL in docs
23c9ac3 [Sean Owen] Add Scala Naive Bayes example, to use existing example data file (whose format needed a tweak)
8c81982 [Sean Owen] Fix small compile errors and typos across MLlib docs
author: Sean Owen <sowen@cloudera.com> 2014-05-06 20:07:22 -0700
committer: Patrick Wendell <pwendell@gmail.com> 2014-05-06 20:07:22 -0700
commit: 25ad8f93012730115a8a1fac649fe3e842c045b3 (patch)
tree: 6bc0dfec7014289e39f4c5c9070ed121e00c4398 /docs/mllib-decision-tree.md
parent: a000b5c3b0438c17e9973df4832c320210c29c27 (diff)
download: spark-25ad8f93012730115a8a1fac649fe3e842c045b3.tar.gz
spark-25ad8f93012730115a8a1fac649fe3e842c045b3.tar.bz2
spark-25ad8f93012730115a8a1fac649fe3e842c045b3.zip
1 files changed, 4 insertions, 4 deletions
diff --git a/docs/mllib-decision-tree.md b/docs/mllib-decision-tree.md
index 0693766990..296277e58b 100644
--- a/docs/mllib-decision-tree.md
+++ b/docs/mllib-decision-tree.md
@@ -83,19 +83,19 @@ Section 9.2.4 in
 [Elements of Statistical Machine Learning](http://statweb.stanford.edu/~tibs/ElemStatLearn/) for
 details). For example, for a binary classification problem with one categorical feature with three
 categories A, B and C with corresponding proportion of label 1 as 0.2, 0.6 and 0.4, the categorical
-features are orded as A followed by C followed B or A, B, C. The two split candidates are A \| C, B
+features are ordered as A followed by C followed B or A, B, C. The two split candidates are A \| C, B
 and A , B \| C where \| denotes the split.
 
 ### Stopping rule
 
 The recursive tree construction is stopped at a node when one of the two conditions is met:
 
-1. The node depth is equal to the `maxDepth` training parammeter
+1. The node depth is equal to the `maxDepth` training parameter
 2. No split candidate leads to an information gain at the node.
 
 ### Practical limitations
 
-1. The tree implementation stores an Array[Double] of size *O(#features \* #splits \* 2^maxDepth)*
+1. The tree implementation stores an `Array[Double]` of size *O(#features \* #splits \* 2^maxDepth)*
    in memory for aggregating histograms over partitions. The current implementation might not scale
    to very deep trees since the memory requirement grows exponentially with tree depth.
 2. The implemented algorithm reads both sparse and dense data. However, it is not optimized for
@@ -178,7 +178,7 @@ val valuesAndPreds = parsedData.map { point =>
   val prediction = model.predict(point.features)
   (point.label, prediction)
 }
-val MSE = valuesAndPreds.map{ case(v, p) => math.pow((v - p), 2)}.reduce(_ + _)/valuesAndPreds.count
+val MSE = valuesAndPreds.map{ case(v, p) => math.pow((v - p), 2)}.mean()
 println("training Mean Squared Error = " + MSE)
 {% endhighlight %}
 </div>
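
The last hunk replaces a sum-then-divide (`reduce(_ + _)` over `count`) with a single `mean()` call. As a rough illustration of why the two forms give the same value, here is a minimal plain-Scala sketch that runs without Spark. The sample data and the `mean` helper are assumptions made up for this example; the helper only stands in for the `mean()` that Spark provides on an RDD of doubles.

```scala
// Plain-Scala sketch (no Spark needed); the sample (label, prediction) pairs are made up.
val valuesAndPreds: Seq[(Double, Double)] =
  Seq((1.0, 0.5), (0.0, 0.25), (1.0, 1.0), (0.0, 0.0))

// Squared error for each (label, prediction) pair, as in the docs example.
val squaredErrors = valuesAndPreds.map { case (v, p) => math.pow(v - p, 2) }

// Old form from the diff: sum with reduce, then divide by the element count.
val mseReduce = squaredErrors.reduce(_ + _) / squaredErrors.size

// New form: a single mean. Scala collections have no built-in mean(), so this
// hypothetical helper stands in for Spark's mean() on an RDD of doubles.
def mean(xs: Seq[Double]): Double = xs.sum / xs.size

val mseMean = mean(squaredErrors)

println(s"reduce/count = $mseReduce, mean = $mseMean")
```

Beyond being shorter, the `mean()` form in the actual Spark example computes the result in one action on `valuesAndPreds`, whereas the old expression ran both a `reduce` and a `count`.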