author Joseph K. Bradley <joseph@databricks.com> 2016-04-28 16:20:00 -0700
committer Joseph K. Bradley <joseph@databricks.com> 2016-04-28 16:20:00 -0700
commit 4f4721a21cc9acc2b6f685bbfc8757d29563a775
tree 6cd62a33cb375e32ba72abfba71c2cf9b64df616
parent dae538a4d7c36191c1feb02ba87ffc624ab960dc
[SPARK-14862][ML] Updated Classifiers to not require labelCol metadata
## What changes were proposed in this pull request?

Updated Classifier, DecisionTreeClassifier, RandomForestClassifier, GBTClassifier to not require input column metadata.
* They first check for metadata.
* If numClasses is not specified in metadata, they identify the largest label value (up to a limit).

This functionality is implemented in a new Classifier.getNumClasses method.

Also
* Updated Classifier.extractLabeledPoints to (a) check label values and (b) include a second version which takes a numClasses value for validity checking.

## How was this patch tested?

* Unit tests in ClassifierSuite for helper methods
* Unit tests for DecisionTreeClassifier, RandomForestClassifier, GBTClassifier with toy datasets lacking label metadata

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #12663 from jkbradley/trees-no-metadata.
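
For illustration only, the fallback described above can be sketched in plain Python without any Spark dependency. This is not the code from the patch: the function names `get_num_classes` and `extract_labeled_points`, the `metadata_num_classes` argument, the default limit of 100, the "largest label + 1" rule, and the non-negative-integer label check are all assumptions made for this sketch.

```python
from typing import Iterable, List, Optional, Sequence, Tuple


def _check_label(label: float) -> None:
    # Assumption for this sketch: class labels must be non-negative integers.
    if label < 0 or not float(label).is_integer():
        raise ValueError(f"invalid class label: {label}")


def get_num_classes(
    labels: Iterable[float],
    metadata_num_classes: Optional[int] = None,
    max_num_classes: int = 100,  # hypothetical limit, chosen for illustration
) -> int:
    # Step 1: prefer the class count from column metadata when it is present.
    if metadata_num_classes is not None:
        return metadata_num_classes

    # Step 2: otherwise infer the count from the largest label value.
    labels = list(labels)
    if not labels:
        raise ValueError("cannot infer the number of classes from no labels")
    max_label = max(labels)
    _check_label(max_label)
    num_classes = int(max_label) + 1  # labels are assumed to start at 0

    # Step 3: refuse to infer an implausibly large number of classes.
    if num_classes > max_num_classes:
        raise ValueError(
            f"inferred {num_classes} classes, above the limit {max_num_classes}"
        )
    return num_classes


def extract_labeled_points(
    rows: Iterable[Tuple[float, Sequence[float]]],
    num_classes: Optional[int] = None,
) -> List[Tuple[float, Sequence[float]]]:
    # Validate every label; when num_classes is given, also check the range.
    points = []
    for label, features in rows:
        _check_label(label)
        if num_classes is not None and label >= num_classes:
            raise ValueError(f"label {label} is outside [0, {num_classes})")
        points.append((label, features))
    return points
```

With labels `[0.0, 1.0, 2.0]` and no metadata, `get_num_classes` returns 3; passing `metadata_num_classes=5` short-circuits the scan and returns 5. The limit exists so that a column that is really continuous (for example a regression target) fails fast instead of producing a very large class count.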