author Timothy Hunter <timhunter@databricks.com> 2017-03-23 18:42:13 -0700
committer Joseph K. Bradley <joseph@databricks.com> 2017-03-23 18:42:13 -0700
commit d27daa54bd341b29737a6352d9a1055151248ae7
tree aa91b5865e37cb2b715a31acf371878a96bb4518 /mllib-local
parent 93581fbc18c01595918c565f6737aaa666116114
[SPARK-19636][ML] Feature parity for correlation statistics in MLlib
## What changes were proposed in this pull request?

This patch adds the Dataframes-based support for the correlation statistics found in the `org.apache.spark.mllib.stat.correlation.Statistics`, following the design doc discussed in the JIRA ticket.

The current implementation is a simple wrapper around the `spark.mllib` implementation. Future optimizations can be implemented at a later stage.

## How was this patch tested?

```
build/sbt "testOnly org.apache.spark.ml.stat.StatisticsSuite"
```

Author: Timothy Hunter <timhunter@databricks.com>

Closes #17108 from thunterdb/19636.
Diffstat (limited to 'mllib-local')
-rw-r--r-- mllib-local/src/test/scala/org/apache/spark/ml/util/TestingUtils.scala | 8
1 files changed, 8 insertions, 0 deletions
diff --git a/mllib-local/src/test/scala/org/apache/spark/ml/util/TestingUtils.scala b/mllib-local/src/test/scala/org/apache/spark/ml/util/TestingUtils.scala
index 2327917e2c..30edd00fb5 100644
--- a/mllib-local/src/test/scala/org/apache/spark/ml/util/TestingUtils.scala
+++ b/mllib-local/src/test/scala/org/apache/spark/ml/util/TestingUtils.scala
@@ -32,6 +32,10 @@ object TestingUtils {
    * the relative tolerance is meaningless, so the exception will be raised to warn users.
    */
   private def RelativeErrorComparison(x: Double, y: Double, eps: Double): Boolean = {
+    // Special case for NaNs
+    if (x.isNaN && y.isNaN) {
+      return true
+    }
     val absX = math.abs(x)
     val absY = math.abs(y)
     val diff = math.abs(x - y)
@@ -49,6 +53,10 @@ object TestingUtils {
    * Private helper function for comparing two values using absolute tolerance.
    */
   private def AbsoluteErrorComparison(x: Double, y: Double, eps: Double): Boolean = {
+    // Special case for NaNs
+    if (x.isNaN && y.isNaN) {
+      return true
+    }
     math.abs(x - y) < eps
   }
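
The NaN special case matters because IEEE 754 arithmetic propagates NaN: `math.abs(NaN - NaN)` is NaN, and any ordering comparison with NaN is false, so without the early return two NaN values could never compare as equal under a tolerance check. The standalone sketch below illustrates this for the absolute-tolerance case only. It is not the Spark helper itself: the object name `NaNComparisonSketch` and the method `absoluteErrorComparisonOld` are hypothetical names introduced for illustration, and the assumption that the tests need this because a correlation result can legitimately contain NaN entries is an inference, not something the commit message states.

```scala
// Hypothetical, self-contained illustration; not part of TestingUtils.scala.
object NaNComparisonSketch {

  // Behaviour before the patch: a plain absolute-tolerance check.
  def absoluteErrorComparisonOld(x: Double, y: Double, eps: Double): Boolean =
    math.abs(x - y) < eps

  // Behaviour after the patch: two NaNs are treated as equal.
  def absoluteErrorComparison(x: Double, y: Double, eps: Double): Boolean = {
    if (x.isNaN && y.isNaN) {
      true
    } else {
      math.abs(x - y) < eps
    }
  }

  def main(args: Array[String]): Unit = {
    val nan = Double.NaN
    println(absoluteErrorComparisonOld(nan, nan, 1e-8))      // false
    println(absoluteErrorComparison(nan, nan, 1e-8))         // true
    println(absoluteErrorComparison(nan, 1.0, 1e-8))         // false
    println(absoluteErrorComparison(1.0, 1.0 + 1e-10, 1e-8)) // true
  }
}
```

Note that the special case only fires when both operands are NaN. A NaN compared against a finite number still fails, so a test that expects a real value and receives NaN is not silently accepted.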