Diffstat (limited to 'docs/mllib-linear-algebra.md')
-rw-r--r-- | docs/mllib-linear-algebra.md | 13 |
1 files changed, 13 insertions, 0 deletions
diff --git a/docs/mllib-linear-algebra.md b/docs/mllib-linear-algebra.md
index cc203d833d..09598be790 100644
--- a/docs/mllib-linear-algebra.md
+++ b/docs/mllib-linear-algebra.md
@@ -59,3 +59,16 @@ val = decomposed.S.data
 println("singular values = " + s.toArray.mkString)
 {% endhighlight %}
+
+
+# Principal Component Analysis
+
+Computes the top k principal component coefficients for the m-by-n data matrix X.
+Rows of X correspond to observations and columns correspond to variables.
+The coefficient matrix is n-by-k. Each column of the return matrix contains coefficients
+for one principal component, and the columns are in descending
+order of component variance. This function centers the data and uses the
+singular value decomposition (SVD) algorithm.
+
+All input and output is expected in DenseMatrix matrix format. See the examples directory
+under "SparkPCA.scala" for example usage.
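The section added by this patch describes the computation only in words, so a small stand-alone sketch may help reviewers check the shapes and ordering it promises. The code below is not the MLlib implementation and does not use its `DenseMatrix` type. `PcaSketch` and all of its methods are hypothetical names, and it operates on plain `Array[Array[Double]]`. It also swaps the SVD for a simpler technique: power iteration with deflation on the covariance matrix of the centered data. In exact arithmetic this yields the same directions as the right singular vectors of the centered matrix (up to sign), assuming the start vector is not orthogonal to the dominant direction and the leading variances are distinct and nonzero.

```scala
// Illustrative sketch only: plain-Scala PCA, NOT the MLlib implementation.
// All names here (PcaSketch, center, covariance, ...) are hypothetical.
object PcaSketch {
  type Matrix = Array[Array[Double]]

  /** Subtract each column's mean so every variable has mean zero. */
  def center(x: Matrix): Matrix = {
    val m = x.length
    val n = x.head.length
    val means = Array.tabulate(n)(j => x.map(_(j)).sum / m)
    x.map(row => Array.tabulate(n)(j => row(j) - means(j)))
  }

  /** Sample covariance (n-by-n) of an already centered m-by-n matrix. */
  def covariance(xc: Matrix): Matrix = {
    val m = xc.length
    val n = xc.head.length
    Array.tabulate(n, n)((i, j) => xc.map(r => r(i) * r(j)).sum / (m - 1))
  }

  private def norm(v: Array[Double]): Double =
    math.sqrt(v.map(a => a * a).sum)

  private def multiply(a: Matrix, v: Array[Double]): Array[Double] =
    a.map(row => row.zip(v).map { case (p, q) => p * q }.sum)

  /** Dominant eigenpair of a symmetric PSD matrix via power iteration. */
  def dominantEigen(a: Matrix, iterations: Int = 1000): (Double, Array[Double]) = {
    // Deterministic start vector; assumed not orthogonal to the answer.
    var v = Array.tabulate(a.length)(i => 1.0 + i)
    v = v.map(_ / norm(v))
    for (_ <- 0 until iterations) {
      val w = multiply(a, v)
      val len = norm(w)
      if (len > 1e-12) v = w.map(_ / len)
    }
    // Rayleigh quotient gives the eigenvalue, i.e. the component variance.
    val lambda = multiply(a, v).zip(v).map { case (p, q) => p * q }.sum
    (lambda, v)
  }

  /** n-by-k coefficient matrix; column j holds the j-th principal component. */
  def topComponents(x: Matrix, k: Int): Matrix = {
    var cov = covariance(center(x))
    val n = cov.length
    val components = Array.ofDim[Double](k, n)
    for (c <- 0 until k) {
      val (lambda, v) = dominantEigen(cov)
      components(c) = v
      // Deflate: remove the component just found so the next pass
      // converges to the direction with the next-largest variance.
      val prev = cov
      cov = Array.tabulate(n, n)((i, j) => prev(i)(j) - lambda * v(i) * v(j))
    }
    Array.tabulate(n, k)((i, j) => components(j)(i))
  }
}
```

Calling `PcaSketch.topComponents(x, k)` on an m-by-n array returns an n-by-k array whose columns are ordered by decreasing variance, matching the layout the new section describes. Each column is determined only up to sign. For real workloads an SVD, as the patch uses, is generally the more robust choice, since power iteration converges slowly when component variances are close.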