author | Dongjoon Hyun <dongjoon@apache.org> | 2016-06-11 12:55:38 +0100 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2016-06-11 12:55:38 +0100 |
commit | ad102af169c7344b30d3b84aa16452fcdc22542c (patch) | |
tree | 3ddc38bba4e271d6e361c7a880d12c030a76a532 /docs/mllib-data-types.md | |
parent | 3761330dd0151d7369d7fba4d4c344e9863990ef (diff) | |
[SPARK-15883][MLLIB][DOCS] Fix broken links in mllib documents
## What changes were proposed in this pull request?
This PR fixes all broken links in the Spark 2.0 preview MLlib documents. It also contains some editorial changes.
**Fix broken links**
* mllib-data-types.md
* mllib-decision-tree.md
* mllib-ensembles.md
* mllib-feature-extraction.md
* mllib-pmml-model-export.md
* mllib-statistics.md
**Fix malformed section header and Scala coding style**
* mllib-linear-methods.md
**Replace indirect forward links with direct ones**
* ml-classification-regression.md
## How was this patch tested?
Manual tests (with `cd docs; jekyll build`).
Author: Dongjoon Hyun <dongjoon@apache.org>
Closes #13608 from dongjoon-hyun/SPARK-15883.
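The link fixes in this commit follow one pattern: a Scaladoc anchor for an object-only API needs a trailing `$`. As a rough sketch of how such links could be spotted before a manual `jekyll build`, the following scans Markdown text for anchors that lack the `$`. The helper name `find_missing_dollar` and the `OBJECT_ONLY` set are hypothetical; the three names in the set are taken from the diff below and are assumed to be object-only APIs.

```python
import re

# Assumption: these names come from the diff in this commit; each is treated
# as a Scala object-only API, so its Scaladoc anchor should end in "$".
OBJECT_ONLY = {
    "org.apache.spark.mllib.linalg.Vectors",
    "org.apache.spark.mllib.linalg.Matrices",
    "org.apache.spark.mllib.util.MLUtils",
}

# Captures the fully qualified name and an optional trailing "$".
LINK = re.compile(r"api/scala/index\.html#([A-Za-z0-9_.]+)(\$?)")


def find_missing_dollar(markdown_text):
    """Return (line_number, name) pairs for OBJECT_ONLY links lacking '$'."""
    hits = []
    for lineno, line in enumerate(markdown_text.splitlines(), start=1):
        for name, dollar in LINK.findall(line):
            if name in OBJECT_ONLY and not dollar:
                hits.append((lineno, name))
    return hits


# Hypothetical sample: line 2 reproduces the kind of link this commit fixes.
sample = (
    "[`Vectors`](api/scala/index.html#org.apache.spark.mllib.linalg.Vectors$)\n"
    "[`Vectors` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Vectors)\n"
    "[`Vector` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Vector)\n"
)
print(find_missing_dollar(sample))
```

The `Vector` link on the third sample line is not flagged because `Vector` is not in the assumed object-only set; the commit likewise leaves that link unchanged.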
Diffstat (limited to 'docs/mllib-data-types.md')
-rw-r--r-- | docs/mllib-data-types.md | 16 |
1 files changed, 6 insertions, 10 deletions
diff --git a/docs/mllib-data-types.md b/docs/mllib-data-types.md
index 2ffe0f1c2b..ef56aebbc3 100644
--- a/docs/mllib-data-types.md
+++ b/docs/mllib-data-types.md
@@ -33,7 +33,7 @@ implementations: [`DenseVector`](api/scala/index.html#org.apache.spark.mllib.lin
 using the factory methods implemented in
 [`Vectors`](api/scala/index.html#org.apache.spark.mllib.linalg.Vectors$) to create local vectors.
 
-Refer to the [`Vector` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Vector) and [`Vectors` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Vectors) for details on the API.
+Refer to the [`Vector` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Vector) and [`Vectors` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Vectors$) for details on the API.
 
 {% highlight scala %}
 import org.apache.spark.mllib.linalg.{Vector, Vectors}
@@ -199,7 +199,7 @@ After loading, the feature indices are converted to zero-based.
 [`MLUtils.loadLibSVMFile`](api/scala/index.html#org.apache.spark.mllib.util.MLUtils$) reads training
 examples stored in LIBSVM format.
 
-Refer to the [`MLUtils` Scala docs](api/scala/index.html#org.apache.spark.mllib.util.MLUtils) for details on the API.
+Refer to the [`MLUtils` Scala docs](api/scala/index.html#org.apache.spark.mllib.util.MLUtils$) for details on the API.
 
 {% highlight scala %}
 import org.apache.spark.mllib.regression.LabeledPoint
@@ -264,7 +264,7 @@ We recommend using the factory methods implemented in
 [`Matrices`](api/scala/index.html#org.apache.spark.mllib.linalg.Matrices$) to create local
 matrices. Remember, local matrices in MLlib are stored in column-major order.
 
-Refer to the [`Matrix` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Matrix) and [`Matrices` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Matrices) for details on the API.
+Refer to the [`Matrix` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Matrix) and [`Matrices` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Matrices$) for details on the API.
 
 {% highlight scala %}
 import org.apache.spark.mllib.linalg.{Matrix, Matrices}
@@ -331,7 +331,7 @@ sm = Matrices.sparse(3, 2, [0, 1, 3], [0, 2, 1], [9, 6, 8])
 A distributed matrix has long-typed row and column indices and double-typed values, stored
 distributively in one or more RDDs. It is very important to choose the right format to store large
 and distributed matrices. Converting a distributed matrix to a different format may require a
-global shuffle, which is quite expensive. Three types of distributed matrices have been implemented
+global shuffle, which is quite expensive. Four types of distributed matrices have been implemented
 so far.
 
 The basic type is called `RowMatrix`. A `RowMatrix` is a row-oriented distributed
@@ -344,6 +344,8 @@ An `IndexedRowMatrix` is similar to a `RowMatrix` but with row indices,
 which can be used for identifying rows and executing joins.
 A `CoordinateMatrix` is a distributed matrix stored in [coordinate list (COO)](https://en.wikipedia.org/wiki/Sparse_matrix#Coordinate_list_.28COO.29) format,
 backed by an RDD of its entries.
+A `BlockMatrix` is a distributed matrix backed by an RDD of `MatrixBlock`
+which is a tuple of `(Int, Int, Matrix)`.
 
 ***Note***
 
@@ -535,12 +537,6 @@ rowsRDD = mat.rows
 
 # Convert to a RowMatrix by dropping the row indices.
 rowMat = mat.toRowMatrix()
-
-# Convert to a CoordinateMatrix.
-coordinateMat = mat.toCoordinateMatrix()
-
-# Convert to a BlockMatrix.
-blockMat = mat.toBlockMatrix()
 {% endhighlight %}
 
 </div>
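The diff context above mentions that local matrices are stored in column-major order and shows the call `Matrices.sparse(3, 2, [0, 1, 3], [0, 2, 1], [9, 6, 8])`. As a minimal pure-Python sketch (it does not use Spark, and the helper names `csc_to_dense` and `column_major` are hypothetical), the following decodes those arguments as column pointers, row indices, and values, then flattens the result column by column:

```python
def csc_to_dense(num_rows, num_cols, col_ptrs, row_indices, values):
    """Expand compressed-sparse-column arrays into a list of rows."""
    dense = [[0.0] * num_cols for _ in range(num_rows)]
    for col in range(num_cols):
        # Entries col_ptrs[col] .. col_ptrs[col + 1] - 1 belong to this column.
        for k in range(col_ptrs[col], col_ptrs[col + 1]):
            dense[row_indices[k]][col] = float(values[k])
    return dense


def column_major(dense):
    """Flatten a list of rows column by column."""
    return [row[c] for c in range(len(dense[0])) for row in dense]


# Arguments copied from the sparse call shown in the diff context.
dense = csc_to_dense(3, 2, [0, 1, 3], [0, 2, 1], [9, 6, 8])
print(dense)                # [[9.0, 0.0], [0.0, 8.0], [0.0, 6.0]]
print(column_major(dense))  # [9.0, 0.0, 0.0, 0.0, 8.0, 6.0]
```

The flattened list is the kind of value array a dense column-major factory would take for the same 3 x 2 matrix.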