field | value | date |
---|---|---|
author | Nick Lavers <nick.lavers@videoamp.com> | 2016-08-19 10:11:59 +0100 |
committer | Sean Owen <sowen@cloudera.com> | 2016-08-19 10:11:59 +0100 |
commit | 5377fc62360d5e9b5c94078e41d10a96e0e8a535 (patch) | |
tree | 1998db20af8d7cc93a2b00308c0f5e8e2b3166a9 /python/pyspark/ml/clustering.py | |
parent | 287bea13050b8eedc3b8b6b3491f1b5e5bc24d7a (diff) | |
download | spark-5377fc62360d5e9b5c94078e41d10a96e0e8a535.tar.gz spark-5377fc62360d5e9b5c94078e41d10a96e0e8a535.tar.bz2 spark-5377fc62360d5e9b5c94078e41d10a96e0e8a535.zip | |
[SPARK-16961][CORE] Fixed off-by-one error that biased randomizeInPlace
JIRA issue link:
https://issues.apache.org/jira/browse/SPARK-16961
Changed one line of Utils.randomizeInPlace to allow elements to stay in place.
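The kind of off-by-one described here can be sketched in Python. This is a hypothetical analogue of a Fisher-Yates style in-place shuffle, not the Scala code in Utils.randomizeInPlace; the function name, the `allow_stay` flag, and the loop shape are assumptions made for illustration.

```python
import random


def randomize_in_place(arr, rand=None, allow_stay=True):
    """Shuffle arr in place (illustrative sketch, not Spark's implementation).

    allow_stay=True  draws j from [0, i], so element i may stay where it is.
    allow_stay=False draws j from [0, i - 1], which forces every element to
    move and therefore cannot produce every permutation.
    """
    if rand is None:
        rand = random.Random()
    for i in range(len(arr) - 1, 0, -1):
        bound = i + 1 if allow_stay else i  # the one-line difference
        j = rand.randrange(bound)           # j is in [0, bound - 1]
        arr[i], arr[j] = arr[j], arr[i]
    return arr
```

With the exclusive bound, no element can ever end up in its original position, so permutations with fixed points (including the identity) are unreachable. With the inclusive bound, each position may swap with itself.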
Created a unit test that runs Pearson's chi-squared test to determine whether the output diverges significantly from a uniform distribution.
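A test of this kind can be sketched as follows. This is a minimal Python illustration, not the unit test added by the commit: the helper names, the array size of 3, the trial count, and the seed are all assumptions. It counts how often each permutation appears and computes Pearson's statistic against the uniform expectation.

```python
import itertools
import random
from collections import Counter

# 5% critical value of the chi-squared distribution with 5 degrees of
# freedom (3! = 6 permutation categories, minus one).
CRITICAL_5DF_P05 = 11.07


def shuffle(arr, rand, inclusive):
    """Hypothetical in-place shuffle; inclusive=False mimics the off-by-one."""
    for i in range(len(arr) - 1, 0, -1):
        j = rand.randrange(i + 1 if inclusive else i)
        arr[i], arr[j] = arr[j], arr[i]


def chi_squared_vs_uniform(inclusive, n=3, trials=6000, seed=0):
    """Pearson's chi-squared statistic of observed permutation counts."""
    rand = random.Random(seed)
    perms = list(itertools.permutations(range(n)))
    counts = Counter()
    for _ in range(trials):
        arr = list(range(n))
        shuffle(arr, rand, inclusive)
        counts[tuple(arr)] += 1
    expected = trials / len(perms)
    return sum((counts[p] - expected) ** 2 / expected for p in perms)


if __name__ == "__main__":
    for inclusive in (True, False):
        stat = chi_squared_vs_uniform(inclusive)
        print(inclusive, stat, stat > CRITICAL_5DF_P05)
```

For three elements, the exclusive-bound variant can only reach two of the six permutations, so its statistic lands far above the critical value. The inclusive variant should usually stay below it, although any single run can exceed a 5% threshold by chance.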
Author: Nick Lavers <nick.lavers@videoamp.com>
Closes #14551 from nicklavers/SPARK-16961-randomizeInPlace.
Diffstat (limited to 'python/pyspark/ml/clustering.py')
-rw-r--r-- | python/pyspark/ml/clustering.py | 12 |
1 files changed, 6 insertions, 6 deletions
diff --git a/python/pyspark/ml/clustering.py b/python/pyspark/ml/clustering.py
index 75d9a0e8ca..4dab83362a 100644
--- a/python/pyspark/ml/clustering.py
+++ b/python/pyspark/ml/clustering.py
@@ -99,9 +99,9 @@ class GaussianMixture(JavaEstimator, HasFeaturesCol, HasPredictionCol, HasMaxIte
     +--------------------+--------------------+
     |                mean|                 cov|
     +--------------------+--------------------+
-    |[-0.0550000000000...|0.002025000000000...|
-    |[0.82499999999999...|0.005625000000000...|
-    |[-0.87,-0.7200000...|0.001600000000000...|
+    |[0.82500000140229...|0.005625000000006...|
+    |[-0.4777098016092...|0.167969502720916...|
+    |[-0.4472625243352...|0.167304119758233...|
     +--------------------+--------------------+
     ...
     >>> transformed = model.transform(df).select("features", "prediction")
@@ -124,9 +124,9 @@ class GaussianMixture(JavaEstimator, HasFeaturesCol, HasPredictionCol, HasMaxIte
     +--------------------+--------------------+
     |                mean|                 cov|
     +--------------------+--------------------+
-    |[-0.0550000000000...|0.002025000000000...|
-    |[0.82499999999999...|0.005625000000000...|
-    |[-0.87,-0.7200000...|0.001600000000000...|
+    |[0.82500000140229...|0.005625000000006...|
+    |[-0.4777098016092...|0.167969502720916...|
+    |[-0.4472625243352...|0.167304119758233...|
     +--------------------+--------------------+
     ...