author Reza Zadeh <rizlar@gmail.com> 2014-09-15 17:41:15 -0700
committer Xiangrui Meng <meng@databricks.com> 2014-09-15 17:41:15 -0700
commit 983d6a9c48b69c5f0542922aa8b133f69eb1034d
tree 446e066bfa0f3c6a28fb6151bfa88ed25ca82b94 /mllib/src
parent 3b93128139e8d303f1d7bfd04e9a99a11a5b6404
[MLlib] Update SVD documentation in IndexedRowMatrix
Updating this to reflect the newest SVD via ARPACK

Author: Reza Zadeh <rizlar@gmail.com>

Closes #2389 from rezazadeh/irmdocs and squashes the following commits:

7fa1313 [Reza Zadeh] Update svd docs
715da25 [Reza Zadeh] Updated computeSVD documentation IndexedRowMatrix
Diffstat (limited to 'mllib/src')
-rw-r--r-- mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/IndexedRowMatrix.scala 12
1 files changed, 4 insertions, 8 deletions
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/IndexedRowMatrix.scala b/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/IndexedRowMatrix.scala
index ac6eaea3f4..5c1acca0ec 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/IndexedRowMatrix.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/IndexedRowMatrix.scala
@@ -76,16 +76,12 @@ class IndexedRowMatrix(
}
/**
- * Computes the singular value decomposition of this matrix.
+ * Computes the singular value decomposition of this IndexedRowMatrix.
* Denote this matrix by A (m x n), this will compute matrices U, S, V such that A = U * S * V'.
*
- * There is no restriction on m, but we require `n^2` doubles to fit in memory.
- * Further, n should be less than m.
-
- * The decomposition is computed by first computing A'A = V S^2 V',
- * computing svd locally on that (since n x n is small), from which we recover S and V.
- * Then we compute U via easy matrix multiplication as U = A * (V * S^-1).
- * Note that this approach requires `O(n^3)` time on the master node.
+ * The cost and implementation of this method is identical to that in
+ * [[org.apache.spark.mllib.linalg.distributed.RowMatrix]]
+ * With the addition of indices.
*
* At most k largest non-zero singular values and associated vectors are returned.
* If there are k such values, then the dimensions of the return will be: