aboutsummaryrefslogtreecommitdiff
path: root/mllib
diff options
context:
space:
mode:
authorDB Tsai <dbt@netflix.com>2016-04-15 01:17:03 -0700
committerXiangrui Meng <meng@databricks.com>2016-04-15 01:17:03 -0700
commit96534aa47c39e0ec40bc38be566455d11e21adb2 (patch)
tree3702b1573b3f6fb5b0bd682e7c70e55ab1868b51 /mllib
parenta9324a06ef0e3646410dc9b3d4f21d66b9064303 (diff)
downloadspark-96534aa47c39e0ec40bc38be566455d11e21adb2.tar.gz
spark-96534aa47c39e0ec40bc38be566455d11e21adb2.tar.bz2
spark-96534aa47c39e0ec40bc38be566455d11e21adb2.zip
[SPARK-14549][ML] Copy the Vector and Matrix classes from mllib to ml in mllib-local
## What changes were proposed in this pull request? This task will copy the Vector and Matrix classes from mllib to ml package in mllib-local jar. The UDTs and `since` annotation in ml vector and matrix will be removed from now. UDTs will be achieved by #SPARK-14487, and `since` will be replaced by /* since 1.2.0 */ The BLAS implementation will be copied, and some of the test utilities will be copies as well. Summary of changes: 1. In mllib-local/src/main/scala/org/apache/spark/**ml**/linalg/BLAS.scala - Copied from mllib/src/main/scala/org/apache/spark/**mllib**/linalg/BLAS.scala - logDebug("gemm: alpha is equal to 0 and beta is equal to 1. Returning C.") is removed in ml version. 2. In mllib-local/src/main/scala/org/apache/spark/**ml**/linalg/Matrices.scala - Copied from mllib/src/main/scala/org/apache/spark/**mllib**/linalg/Matrices.scala - `Since` was removed, and we'll use standard `/* Since /*` Java doc. Will be in another PR. - `UDT` related code was removed, and will use `SPARK-13944` https://github.com/apache/spark/pull/12259 to replace the annotation. 3. In mllib-local/src/main/scala/org/apache/spark/**ml**/linalg/Vectors.scala - Copied from mllib/src/main/scala/org/apache/spark/**mllib**/linalg/Vectors.scala - `Since` was removed. - `UDT` related code was removed. - In `def parseNumeric`, it was throwing `throw new SparkException(s"Cannot parse $other.")`, and now it's throwing `throw new IllegalArgumentException(s"Cannot parse $other.")` 4. In mllib/src/main/scala/org/apache/spark/**mllib**/linalg/Vectors.scala - For consistency with ML version of vector, `def parseNumeric` is now throwing `throw new IllegalArgumentException(s"Cannot parse $other.")` 5. mllib/src/main/scala/org/apache/spark/**mllib**/util/NumericParser.scala is moved to mllib-local/src/main/scala/org/apache/spark/**ml**/util/NumericParser.scala - All the `throw new SparkException` were replaced by `throw new IllegalArgumentException` ## How was this patch tested? unit tests Author: DB Tsai <dbt@netflix.com> Closes #12317 from dbtsai/dbtsai-ml-vector.
Diffstat (limited to 'mllib')
-rw-r--r--mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala4
1 files changed, 2 insertions, 2 deletions
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala b/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala
index e542f21a18..0c6aabf192 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala
@@ -182,8 +182,8 @@ class MLUtilsSuite extends SparkFunSuite with MLlibTestSparkContext {
for (folds <- 2 to 10) {
for (seed <- 1 to 5) {
val foldedRdds = kFold(data, folds, seed)
- assert(foldedRdds.size === folds)
- foldedRdds.map { case (training, validation) =>
+ assert(foldedRdds.length === folds)
+ foldedRdds.foreach { case (training, validation) =>
val result = validation.union(training).collect().sorted
val validationSize = validation.collect().size.toFloat
assert(validationSize > 0, "empty validation data")