author | Yanbo Liang <ybliang8@gmail.com> | 2016-07-29 04:40:20 -0700 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2016-07-29 04:40:20 -0700 |
commit | 0557a45452f6e73877e5ec972110825ce8f3fbc5 (patch) | |
tree | 28b18541ba9bfc1217041a08a2210c3d5835c757 /dev/deps/spark-deps-hadoop-2.2 | |
parent | d1d5069aa3744d46abd3889abab5f15e9067382a (diff) | |
[SPARK-16750][ML] Fix GaussianMixture training failed due to feature column type mistake
## What changes were proposed in this pull request?
ML ```GaussianMixture``` training failed because of a feature column type mistake. The feature column type should be ```ml.linalg.VectorUDT```, but ```mllib.linalg.VectorUDT``` was used by mistake.
See [SPARK-16750](https://issues.apache.org/jira/browse/SPARK-16750) for how to reproduce this bug.
Why did the unit tests not catch this error? Because some estimators/transformers did not call ```transformSchema(dataset.schema)``` first during ```fit``` or ```transform```. This PR also adds that call to all estimators/transformers that were missing it.
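To illustrate why calling ```transformSchema``` first matters, here is a minimal sketch in plain Scala that does not depend on Spark. Everything in it is hypothetical and exists only for illustration: ```ColumnType```, ```MlVector```, ```MllibVector```, and ```ToyEstimator``` are stand-ins, not Spark classes, and the schema is simplified to a map from column name to type. It is not the code changed by this PR.

```scala
// Toy stand-ins for the two vector column types (assumed names, not Spark's classes).
sealed trait ColumnType
case object MlVector extends ColumnType     // plays the role of ml.linalg.VectorUDT
case object MllibVector extends ColumnType  // plays the role of mllib.linalg.VectorUDT

// Hypothetical estimator; a "schema" here is just column name -> column type.
class ToyEstimator(featuresCol: String) {

  // Validate the input schema and fail fast on a wrong features column type.
  def transformSchema(schema: Map[String, ColumnType]): Map[String, ColumnType] = {
    val actual = schema.getOrElse(
      featuresCol,
      throw new IllegalArgumentException(s"Column $featuresCol does not exist."))
    require(actual == MlVector,
      s"Column $featuresCol must be of type $MlVector but was actually $actual.")
    schema
  }

  // Calling transformSchema first surfaces a type mismatch before any training work.
  def fit(schema: Map[String, ColumnType]): String = {
    transformSchema(schema)
    "model" // placeholder for a fitted model
  }
}
```

In this sketch, ```fit``` on a schema whose features column is ```MllibVector``` throws an ```IllegalArgumentException``` immediately, whereas a ```fit``` that skipped the ```transformSchema``` call would let the mismatch go unnoticed until later.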
## How was this patch tested?
No new tests; the existing tests should pass.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #14378 from yanboliang/spark-16750.
Diffstat (limited to 'dev/deps/spark-deps-hadoop-2.2')
0 files changed, 0 insertions, 0 deletions