author | Liang-Chi Hsieh <viirya@gmail.com> | 2015-02-02 19:34:25 -0800 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2015-02-02 19:34:25 -0800 |
commit | 1bcd46574e442e20f55709d70573f271ce44e5b9 (patch) | |
tree | d54d597053d9aab0191dca30ad2edfa34b402f45 /mllib/src/test | |
parent | 0561c4544967fb853419f32e014fac9b8879b0db (diff) | |
[SPARK-5512][Mllib] Run the PIC algorithm with initial vector suggested by the PIC paper
As suggested by the Power Iteration Clustering paper, it is useful to set the initial vector v0 to the degree vector d. This PR adds an option to run the algorithm with that initialization.
Author: Liang-Chi Hsieh <viirya@gmail.com>
Closes #4301 from viirya/pic_degreevector and squashes the following commits:
7db28fb [Liang-Chi Hsieh] Refactor it to address comments.
19cf94e [Liang-Chi Hsieh] Add an option to select initialization method.
ec88567 [Liang-Chi Hsieh] Run the PIC algorithm with degree vector d as suggested by the PIC paper.
Diffstat (limited to 'mllib/src/test')
-rw-r--r-- | mllib/src/test/scala/org/apache/spark/mllib/clustering/PowerIterationClusteringSuite.scala | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/clustering/PowerIterationClusteringSuite.scala b/mllib/src/test/scala/org/apache/spark/mllib/clustering/PowerIterationClusteringSuite.scala
index 2bae465d39..03ecd9ca73 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/clustering/PowerIterationClusteringSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/clustering/PowerIterationClusteringSuite.scala
@@ -55,6 +55,16 @@ class PowerIterationClusteringSuite extends FunSuite with MLlibTestSparkContext
       predictions(c) += i
     }
     assert(predictions.toSet == Set((0 to 3).toSet, (4 to 15).toSet))
+
+    val model2 = new PowerIterationClustering()
+      .setK(2)
+      .setInitializationMode("degree")
+      .run(sc.parallelize(similarities, 2))
+    val predictions2 = Array.fill(2)(mutable.Set.empty[Long])
+    model2.assignments.collect().foreach { case (i, c) =>
+      predictions2(c) += i
+    }
+    assert(predictions2.toSet == Set((0 to 3).toSet, (4 to 15).toSet))
   }
 
   test("normalize and powerIter") {
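To illustrate what the "degree" initialization in the commit message refers to, here is a minimal sketch in plain Scala of building an initial vector v0 from the degree vector d of a dense affinity matrix. It is not the code from this commit: the `DegreeInit` object and its method names are hypothetical, the dense `Array[Array[Double]]` representation is an assumption made for brevity, and dividing d by its total so the entries sum to 1 is one common normalization rather than a confirmed detail of the MLlib implementation.

```scala
// Hypothetical helper, for illustration only; not part of Spark MLlib.
object DegreeInit {

  /** Degree vector: d(i) is the sum of row i of the affinity matrix. */
  def degreeVector(affinity: Array[Array[Double]]): Array[Double] =
    affinity.map(_.sum)

  /** Initial vector v0 = d / sum(d), so the entries of v0 sum to 1 (assumed normalization). */
  def initialVector(affinity: Array[Array[Double]]): Array[Double] = {
    val d = degreeVector(affinity)
    val total = d.sum
    require(total > 0.0, "affinity matrix must have positive total weight")
    d.map(_ / total)
  }
}
```

For a three-node graph in which node 0 is connected to nodes 1 and 2 with unit weight, the degrees are 2, 1, and 1, so the sketch gives node 0 half of the initial mass and the other two nodes a quarter each.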