SPARK-1727. Correct small compile errors, typos, and markdown issues in (primarly) MLlib docs

While play-testing the Scala and Java code examples in the MLlib docs, I noticed a number of small compile errors, and some typos. This led to finding and fixing a few similar items in other docs. Then in the course of building the site docs to check the result, I found a few small suggestions for the build instructions. I also found a few more formatting and markdown issues uncovered when I accidentally used maruku instead of kramdown. Author: Sean Owen <sowen@cloudera.com> Closes #653 from srowen/SPARK-1727 and squashes the following commits: 6e7c38a [Sean Owen] Final doc updates - one more compile error, and use of mean instead of sum and count 8f5e847 [Sean Owen] Fix markdown syntax issues that maruku flags, even though we use kramdown (but only those that do not affect kramdown's output) 99966a9 [Sean Owen] Update issue tracker URL in docs 23c9ac3 [Sean Owen] Add Scala Naive Bayes example, to use existing example data file (whose format needed a tweak) 8c81982 [Sean Owen] Fix small compile errors and typos across MLlib docs
author: Sean Owen <sowen@cloudera.com> 2014-05-06 20:07:22 -0700
committer: Patrick Wendell <pwendell@gmail.com> 2014-05-06 20:07:22 -0700
commit: 25ad8f93012730115a8a1fac649fe3e842c045b3 (patch)
tree: 6bc0dfec7014289e39f4c5c9070ed121e00c4398 /docs/mllib-basics.md
parent: a000b5c3b0438c17e9973df4832c320210c29c27 (diff)
download: spark-25ad8f93012730115a8a1fac649fe3e842c045b3.tar.gz
spark-25ad8f93012730115a8a1fac649fe3e842c045b3.tar.bz2
spark-25ad8f93012730115a8a1fac649fe3e842c045b3.zip
1 files changed, 9 insertions, 5 deletions
diff --git a/docs/mllib-basics.md b/docs/mllib-basics.md
index 710ce1721f..704308802d 100644
--- a/docs/mllib-basics.md
+++ b/docs/mllib-basics.md
@@ -9,7 +9,7 @@ title: <a href="mllib-guide.html">MLlib</a> - Basics
 MLlib supports local vectors and matrices stored on a single machine, 
 as well as distributed matrices backed by one or more RDDs.
 In the current implementation, local vectors and matrices are simple data models 
-to serve public interfaces. The underly linear algebra operations are provided by
+to serve public interfaces. The underlying linear algebra operations are provided by
 [Breeze](http://www.scalanlp.org/) and [jblas](http://jblas.org/).
 A training example used in supervised learning is called "labeled point" in MLlib.
 
@@ -205,7 +205,7 @@ import org.apache.spark.mllib.regression.LabeledPoint;
 import org.apache.spark.mllib.util.MLUtils;
 import org.apache.spark.rdd.RDDimport;
 
-RDD[LabeledPoint] training = MLUtils.loadLibSVMData(sc, "mllib/data/sample_libsvm_data.txt")
+RDD<LabeledPoint> training = MLUtils.loadLibSVMData(jsc, "mllib/data/sample_libsvm_data.txt");
 {% endhighlight %}
 </div>
 </div>
@@ -307,6 +307,7 @@ A [`RowMatrix`](api/mllib/index.html#org.apache.spark.mllib.linalg.distributed.R
 created from a `JavaRDD<Vector>` instance.  Then we can compute its column summary statistics.
 
 {% highlight java %}
+import org.apache.spark.api.java.JavaRDD;
 import org.apache.spark.mllib.linalg.Vector;
 import org.apache.spark.mllib.linalg.distributed.RowMatrix;
 
@@ -348,10 +349,10 @@ val mat: RowMatrix = ... // a RowMatrix
 val summary: MultivariateStatisticalSummary = mat.computeColumnSummaryStatistics()
 println(summary.mean) // a dense vector containing the mean value for each column
 println(summary.variance) // column-wise variance
-println(summary.numNonzers) // number of nonzeros in each column
+println(summary.numNonzeros) // number of nonzeros in each column
 
 // Compute the covariance matrix.
-val Cov: Matrix = mat.computeCovariance()
+val cov: Matrix = mat.computeCovariance()
 {% endhighlight %}
 </div>
 </div>
@@ -397,11 +398,12 @@ wrapper over `(long, Vector)`.  An `IndexedRowMatrix` can be converted to a `Row
 its row indices.
 
 {% highlight java %}
+import org.apache.spark.api.java.JavaRDD;
 import org.apache.spark.mllib.linalg.distributed.IndexedRow;
 import org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix;
 import org.apache.spark.mllib.linalg.distributed.RowMatrix;
 
-JavaRDD[IndexedRow] rows = ... // a JavaRDD of indexed rows
+JavaRDD<IndexedRow> rows = ... // a JavaRDD of indexed rows
 // Create an IndexedRowMatrix from a JavaRDD<IndexedRow>.
 IndexedRowMatrix mat = new IndexedRowMatrix(rows.rdd());
 
@@ -458,7 +460,9 @@ wrapper over `(long, long, double)`.  A `CoordinateMatrix` can be converted to a
 with sparse rows by calling `toIndexedRowMatrix`.
 
 {% highlight scala %}
+import org.apache.spark.api.java.JavaRDD;
 import org.apache.spark.mllib.linalg.distributed.CoordinateMatrix;
+import org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix;
 import org.apache.spark.mllib.linalg.distributed.MatrixEntry;
 
 JavaRDD<MatrixEntry> entries = ... // a JavaRDD of matrix entries
author	Sean Owen <sowen@cloudera.com>	2014-05-06 20:07:22 -0700
committer	Patrick Wendell <pwendell@gmail.com>	2014-05-06 20:07:22 -0700
commit	25ad8f93012730115a8a1fac649fe3e842c045b3 (patch)
tree	6bc0dfec7014289e39f4c5c9070ed121e00c4398 /docs/mllib-basics.md
parent	a000b5c3b0438c17e9973df4832c320210c29c27 (diff)
download	spark-25ad8f93012730115a8a1fac649fe3e842c045b3.tar.gz spark-25ad8f93012730115a8a1fac649fe3e842c045b3.tar.bz2 spark-25ad8f93012730115a8a1fac649fe3e842c045b3.zip