author: Liang-Chi Hsieh <viirya@gmail.com> 2015-02-02 19:34:25 -0800
committer: Xiangrui Meng <meng@databricks.com> 2015-02-02 19:34:25 -0800
commit: 1bcd46574e442e20f55709d70573f271ce44e5b9
tree: d54d597053d9aab0191dca30ad2edfa34b402f45 /mllib/src/test
parent: 0561c4544967fb853419f32e014fac9b8879b0db
[SPARK-5512][Mllib] Run the PIC algorithm with initial vector suggested by the PIC paper

As suggested by the Power Iteration Clustering paper, it is useful to set the initial vector v0 to the degree vector d. This PR adds an option to run the algorithm with that initialization.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #4301 from viirya/pic_degreevector and squashes the following commits:

7db28fb [Liang-Chi Hsieh] Refactor it to address comments.
19cf94e [Liang-Chi Hsieh] Add an option to select initialization method.
ec88567 [Liang-Chi Hsieh] Run the PIC algorithm with degree vector d as suggested by the PIC paper.
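To illustrate what a degree-vector start could look like, here is a minimal sketch in plain Scala, without Spark. It assumes the similarities are undirected (i, j, s) triples and that v0 is the degree vector scaled to sum to 1. The object and method names (`DegreeInitSketch`, `degreeInit`) are hypothetical and are not part of this commit or of MLlib, and the actual implementation may normalize differently.

```scala
// Sketch only: degree-vector initialization on a small in-memory
// similarity list. Names here are illustrative, not MLlib API.
object DegreeInitSketch {

  /** Returns v0 with v0(i) = d_i / sum_j d_j, where d_i is the total
    * similarity incident to node i (assumed normalization).
    */
  def degreeInit(n: Int, similarities: Seq[(Int, Int, Double)]): Array[Double] = {
    val degree = Array.fill(n)(0.0)
    // Each (i, j, s) triple is treated as an undirected edge,
    // so it contributes s to the degree of both endpoints.
    similarities.foreach { case (i, j, s) =>
      degree(i) += s
      degree(j) += s
    }
    val total = degree.sum
    require(total > 0.0, "similarities must have positive total weight")
    degree.map(_ / total)
  }
}
```

For example, `DegreeInitSketch.degreeInit(3, Seq((0, 1, 1.0), (1, 2, 3.0)))` gives node 1 the largest starting weight because it touches both edges.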
Diffstat (limited to 'mllib/src/test')
-rw-r--r-- mllib/src/test/scala/org/apache/spark/mllib/clustering/PowerIterationClusteringSuite.scala | 10
1 file changed, 10 insertions, 0 deletions
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/clustering/PowerIterationClusteringSuite.scala b/mllib/src/test/scala/org/apache/spark/mllib/clustering/PowerIterationClusteringSuite.scala
index 2bae465d39..03ecd9ca73 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/clustering/PowerIterationClusteringSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/clustering/PowerIterationClusteringSuite.scala
@@ -55,6 +55,16 @@ class PowerIterationClusteringSuite extends FunSuite with MLlibTestSparkContext
       predictions(c) += i
     }
     assert(predictions.toSet == Set((0 to 3).toSet, (4 to 15).toSet))
+
+    val model2 = new PowerIterationClustering()
+      .setK(2)
+      .setInitializationMode("degree")
+      .run(sc.parallelize(similarities, 2))
+    val predictions2 = Array.fill(2)(mutable.Set.empty[Long])
+    model2.assignments.collect().foreach { case (i, c) =>
+      predictions2(c) += i
+    }
+    assert(predictions2.toSet == Set((0 to 3).toSet, (4 to 15).toSet))
   }
 
   test("normalize and powerIter") {