author | Joseph K. Bradley <joseph@databricks.com> | 2016-04-26 16:53:16 -0700 |
---|---|---|
committer | DB Tsai <dbt@netflix.com> | 2016-04-26 16:53:16 -0700 |
commit | bd2c9a6d48ef6d489c747d9db2642bdef6b1f728 (patch) | |
tree | 9a8a4864825aca4e8f11d4442d33e1ca4f7ac0c4 /CONTRIBUTING.md | |
parent | 0c99c23b7d9f0c3538cd2b062d551411712a2bcc (diff) | |
[SPARK-14732][ML] spark.ml GaussianMixture should use MultivariateGaussian in mllib-local
## What changes were proposed in this pull request?
Previously, the spark.ml GaussianMixtureModel exposed the spark.mllib MultivariateGaussian in its public API. That API was added after 1.6, so it can be changed without breaking a released API.
This PR copies MultivariateGaussian to mllib-local in spark.ml, with a few changes:
* Renamed fields to match numpy, scipy: mu => mean, sigma => cov
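To illustrate the renamed fields, here is a minimal pure-Python sketch of a Gaussian density parameterized by `mean` and `cov`, the names numpy and scipy use. The class `Gaussian2D` and its restriction to two dimensions are assumptions made for brevity; this is not the Spark implementation.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Gaussian2D:
    """Hypothetical 2-D Gaussian using the numpy/scipy field names."""
    mean: Sequence[float]            # formerly "mu"
    cov: Sequence[Sequence[float]]   # formerly "sigma", a 2x2 matrix

    def logpdf(self, x: Sequence[float]) -> float:
        (a, b), (c, d) = self.cov
        det = a * d - b * c
        if a <= 0.0 or det <= 0.0:
            raise ValueError("cov must be positive definite")
        dx = x[0] - self.mean[0]
        dy = x[1] - self.mean[1]
        # Quadratic form (x - mean)^T cov^{-1} (x - mean), using the
        # closed-form inverse of a 2x2 matrix.
        quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
        # k = 2 dimensions, so the normalizer is 2 * log(2 * pi) + log(det).
        return -0.5 * (2.0 * math.log(2.0 * math.pi) + math.log(det) + quad)

    def pdf(self, x: Sequence[float]) -> float:
        return math.exp(self.logpdf(x))


standard = Gaussian2D(mean=[0.0, 0.0], cov=[[1.0, 0.0], [0.0, 1.0]])
density_at_origin = standard.pdf([0.0, 0.0])
```

For the standard 2-D normal, the density at the origin is 1 / (2π).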
This PR then uses the spark.ml MultivariateGaussian in the spark.ml GaussianMixtureModel, which involves:
* Modifying the constructor
* Adding a computeProbabilities method
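A computeProbabilities-style helper can be sketched as follows: weight each component density, add a small epsilon so the sum is never zero, and normalize. The function name `compute_probabilities`, the 1-D `normal_pdf` helper, and the exact use of epsilon are assumptions for illustration, not the signature or body added in this PR.

```python
import math
import sys
from typing import Callable, List, Sequence

# Stand-in for a machine-epsilon constant (assumption for this sketch).
EPSILON = sys.float_info.epsilon


def compute_probabilities(
    x: float,
    pdfs: Sequence[Callable[[float], float]],
    weights: Sequence[float],
) -> List[float]:
    """Return the normalized membership probability of x for each component."""
    # EPSILON keeps the total strictly positive even when every pdf underflows.
    scores = [EPSILON + w * pdf(x) for w, pdf in zip(weights, pdfs)]
    total = sum(scores)
    return [s / total for s in scores]


def normal_pdf(mean: float, var: float) -> Callable[[float], float]:
    """Hypothetical 1-D Gaussian density used only to exercise the helper."""
    def pdf(x: float) -> float:
        return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
    return pdf


components = [normal_pdf(0.0, 1.0), normal_pdf(5.0, 1.0)]
probs_near_first = compute_probabilities(0.0, components, [0.5, 0.5])
probs_midpoint = compute_probabilities(2.5, components, [0.5, 0.5])
```

A point at 0.0 is assigned almost entirely to the first component, while the midpoint 2.5 splits evenly between the two.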
Also:
* Added EPSILON to mllib-local for use in MultivariateGaussian
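For context, a machine-epsilon constant of this kind is commonly computed by halving a value until adding half of it to 1.0 no longer changes the result. The sketch below assumes EPSILON is defined that way; the function name `machine_epsilon` is hypothetical.

```python
import sys


def machine_epsilon() -> float:
    """Smallest power of two eps for which 1.0 + eps / 2 == 1.0 in floating point."""
    eps = 1.0
    while 1.0 + eps / 2.0 != 1.0:
        eps /= 2.0
    return eps


EPSILON = machine_epsilon()
```

On IEEE 754 double precision this matches `sys.float_info.epsilon`.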
## How was this patch tested?
Existing unit tests
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #12593 from jkbradley/sparkml-gmm-fix.