author | MechCoder <manojkumarsivaraj334@gmail.com> | 2015-07-24 14:58:07 -0700 |
---|---|---|
committer | Joseph K. Bradley <joseph@databricks.com> | 2015-07-24 14:58:07 -0700 |
commit | a400ab516fa93185aa683a596f9d7c6c1a02f330 (patch) | |
tree | 0dfc4adc09cd782fedb5a0d30d09e061f55bee61 /mllib/src/test/scala | |
parent | 64135cbb3363e3b74dad3c0498cb9959c047d381 (diff) | |
[SPARK-7045] [MLLIB] Avoid intermediate representation when creating model
Word2Vec used to convert from an Array[Float] representation to a Map[String, Array[Float]] and then back to an Array[Float] through Word2VecModel.
This change avoids that round-trip conversion while still supporting the older method of constructing a model from a Map.
Author: MechCoder <manojkumarsivaraj334@gmail.com>
Closes #5748 from MechCoder/spark-7045 and squashes the following commits:
e308913 [MechCoder] move docs
5703116 [MechCoder] minor
fa04313 [MechCoder] style fixes
b1d61c4 [MechCoder] better errors and tests
3b32c8c [MechCoder] [SPARK-7045] Avoid intermediate representation when creating model
Diffstat (limited to 'mllib/src/test/scala')
-rw-r--r-- | mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala b/mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala
index b681836920..4cc8d1129b 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala
@@ -37,6 +37,12 @@ class Word2VecSuite extends SparkFunSuite with MLlibTestSparkContext {
     assert(syms.length == 2)
     assert(syms(0)._1 == "b")
     assert(syms(1)._1 == "c")
+
+    // Test that model built using Word2Vec, i.e wordVectors and wordIndec
+    // and a Word2VecMap give the same values.
+    val word2VecMap = model.getVectors
+    val newModel = new Word2VecModel(word2VecMap)
+    assert(newModel.getVectors.mapValues(_.toSeq) === word2VecMap.mapValues(_.toSeq))
   }
 
   test("Word2VecModel") {
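The commit message describes two representations of the same word vectors: a flat Array[Float] addressed through a word index, and a Map[String, Array[Float]]. The following is a minimal stand-alone sketch of converting between them, assuming each word owns a contiguous slice of the flat array. The object name FlatVectorsSketch and the helpers flatten and unflatten are hypothetical and are not part of Spark; the layout inside Word2VecModel may differ.

```scala
// Hypothetical sketch, not the Spark implementation: it only illustrates
// the flat-array and map representations named in the commit message.
object FlatVectorsSketch {

  // Pack a word -> vector map into a word index plus one flat array.
  // Assumption: word i occupies the slice [i * size, (i + 1) * size).
  def flatten(vectors: Map[String, Array[Float]]): (Map[String, Int], Array[Float]) = {
    require(vectors.nonEmpty, "need at least one word vector")
    val size = vectors.head._2.length
    val wordIndex = vectors.keys.zipWithIndex.toMap
    val flat = new Array[Float](size * vectors.size)
    for ((word, i) <- wordIndex) {
      val v = vectors(word)
      require(v.length == size, s"vector for '$word' has length ${v.length}, expected $size")
      Array.copy(v, 0, flat, i * size, size)
    }
    (wordIndex, flat)
  }

  // Rebuild the map view by slicing the flat array per word.
  def unflatten(wordIndex: Map[String, Int], flat: Array[Float]): Map[String, Array[Float]] = {
    require(wordIndex.nonEmpty, "need at least one indexed word")
    val size = flat.length / wordIndex.size
    wordIndex.map { case (word, i) => word -> flat.slice(i * size, (i + 1) * size) }
  }

  def main(args: Array[String]): Unit = {
    val original = Map("a" -> Array(1f, 2f), "b" -> Array(3f, 4f))
    val (index, flat) = flatten(original)
    val rebuilt = unflatten(index, flat)
    // Arrays compare by reference, so compare contents as Seq,
    // the same trick the added test in the diff uses.
    assert(rebuilt.map { case (k, v) => k -> v.toSeq } ==
      original.map { case (k, v) => k -> v.toSeq })
  }
}
```

Keeping the flat array as the primary form means a model can be built from training output without materializing a Map, while unflatten shows why a Map-based constructor and getVectors can still be offered on top of it.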