From 96534aa47c39e0ec40bc38be566455d11e21adb2 Mon Sep 17 00:00:00 2001 From: DB Tsai Date: Fri, 15 Apr 2016 01:17:03 -0700 Subject: [SPARK-14549][ML] Copy the Vector and Matrix classes from mllib to ml in mllib-local ## What changes were proposed in this pull request? This task will copy the Vector and Matrix classes from mllib to ml package in mllib-local jar. The UDTs and `since` annotation in ml vector and matrix will be removed from now. UDTs will be achieved by #SPARK-14487, and `since` will be replaced by /* since 1.2.0 */ The BLAS implementation will be copied, and some of the test utilities will be copies as well. Summary of changes: 1. In mllib-local/src/main/scala/org/apache/spark/**ml**/linalg/BLAS.scala - Copied from mllib/src/main/scala/org/apache/spark/**mllib**/linalg/BLAS.scala - logDebug("gemm: alpha is equal to 0 and beta is equal to 1. Returning C.") is removed in ml version. 2. In mllib-local/src/main/scala/org/apache/spark/**ml**/linalg/Matrices.scala - Copied from mllib/src/main/scala/org/apache/spark/**mllib**/linalg/Matrices.scala - `Since` was removed, and we'll use standard `/* Since /*` Java doc. Will be in another PR. - `UDT` related code was removed, and will use `SPARK-13944` https://github.com/apache/spark/pull/12259 to replace the annotation. 3. In mllib-local/src/main/scala/org/apache/spark/**ml**/linalg/Vectors.scala - Copied from mllib/src/main/scala/org/apache/spark/**mllib**/linalg/Vectors.scala - `Since` was removed. - `UDT` related code was removed. - In `def parseNumeric`, it was throwing `throw new SparkException(s"Cannot parse $other.")`, and now it's throwing `throw new IllegalArgumentException(s"Cannot parse $other.")` 4. In mllib/src/main/scala/org/apache/spark/**mllib**/linalg/Vectors.scala - For consistency with ML version of vector, `def parseNumeric` is now throwing `throw new IllegalArgumentException(s"Cannot parse $other.")` 5. mllib/src/main/scala/org/apache/spark/**mllib**/util/NumericParser.scala is moved to mllib-local/src/main/scala/org/apache/spark/**ml**/util/NumericParser.scala - All the `throw new SparkException` were replaced by `throw new IllegalArgumentException` ## How was this patch tested? unit tests Author: DB Tsai Closes #12317 from dbtsai/dbtsai-ml-vector. --- mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'mllib') diff --git a/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala b/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala index e542f21a18..0c6aabf192 100644 --- a/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala +++ b/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala @@ -182,8 +182,8 @@ class MLUtilsSuite extends SparkFunSuite with MLlibTestSparkContext { for (folds <- 2 to 10) { for (seed <- 1 to 5) { val foldedRdds = kFold(data, folds, seed) - assert(foldedRdds.size === folds) - foldedRdds.map { case (training, validation) => + assert(foldedRdds.length === folds) + foldedRdds.foreach { case (training, validation) => val result = validation.union(training).collect().sorted val validationSize = validation.collect().size.toFloat assert(validationSize > 0, "empty validation data") -- cgit v1.2.3