diff options
author | Joseph K. Bradley <joseph.kurata.bradley@gmail.com> | 2014-08-03 10:36:52 -0700 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2014-08-03 10:36:52 -0700 |
commit | 2998e38a942351974da36cb619e863c6f0316e7a (patch) | |
tree | bba84386898fa3fd49cbc4acfc3b38457f0c4882 /examples | |
parent | a0bcbc159e89be868ccc96175dbf1439461557e1 (diff) | |
download | spark-2998e38a942351974da36cb619e863c6f0316e7a.tar.gz spark-2998e38a942351974da36cb619e863c6f0316e7a.tar.bz2 spark-2998e38a942351974da36cb619e863c6f0316e7a.zip |
[SPARK-2197] [mllib] Java DecisionTree bug fix and easy-of-use
Bug fix: Before, when an RDD was created in Java and passed to DecisionTree.train(), the fake class tag caused problems.
* Fix: DecisionTree: Used new RDD.retag() method to allow passing RDDs from Java.
Other improvements to Decision Trees for easy-of-use with Java:
* impurity classes: Added instance() methods to help with Java interface.
* Strategy: Added Java-friendly constructor
--> Note: I removed quantileCalculationStrategy from the Java-friendly constructor since (a) it is a special class and (b) there is only 1 option currently. I suspect we will redo the API before the other options are included.
CC: mengxr
Author: Joseph K. Bradley <joseph.kurata.bradley@gmail.com>
Closes #1740 from jkbradley/dt-java-new and squashes the following commits:
0805dc6 [Joseph K. Bradley] Changed Strategy to use JavaConverters instead of JavaConversions
519b1b7 [Joseph K. Bradley] * Organized imports in JavaDecisionTreeSuite.java * Using JavaConverters instead of JavaConversions in DecisionTreeSuite.scala
f7b5ca1 [Joseph K. Bradley] Improvements to make it easier to run DecisionTree from Java. * DecisionTree: Used new RDD.retag() method to allow passing RDDs from Java. * impurity classes: Added instance() methods to help with Java interface. * Strategy: Added Java-friendly constructor ** Note: I removed quantileCalculationStrategy from the Java-friendly constructor since (a) it is a special class and (b) there is only 1 option currently. I suspect we will redo the API before the other options are included.
d78ada6 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-java
320853f [Joseph K. Bradley] Added JavaDecisionTreeSuite, partly written
13a585e [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-java
f1a8283 [Joseph K. Bradley] Added old JavaDecisionTreeSuite, to be updated later
225822f [Joseph K. Bradley] Bug: In DecisionTree, the method sequentialBinSearchForOrderedCategoricalFeatureInClassification() indexed bins from 0 to (math.pow(2, featureCategories.toInt - 1) - 1). This upper bound is the bound for unordered categorical features, not ordered ones. The upper bound should be the arity (i.e., max value) of the feature.
Diffstat (limited to 'examples')
0 files changed, 0 insertions, 0 deletions