author | Joseph K. Bradley <joseph@databricks.com> | 2014-11-05 10:33:13 -0800 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2014-11-05 10:33:22 -0800 |
commit | 9cba88c7f9fdf151217716e4cc5fa75995736922 | |
tree | 3b93e2411b2abf4218ccb776455aade06988377f /docs | |
parent | 46654b0661257f432932c6efc09c4c0983521834 | |
[SPARK-4197] [mllib] GradientBoosting API cleanup and examples in Scala, Java
### Summary
* Made it easier to construct default Strategy and BoostingStrategy and to set parameters using simple types.
* Added Scala and Java examples for GradientBoostedTrees.
* Small cleanups and fixes.
### Details
GradientBoosting bug fixes (“bug” = bad default options)
* Force boostingStrategy.weakLearnerParams.algo = Regression
* Force boostingStrategy.weakLearnerParams.impurity = impurity.Variance
* Only persist data if not yet persisted (since it causes an error if persisted twice)
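The persist-only-once fix can be sketched in plain Java as below. This is an illustration under stated assumptions, not the code in this commit: `FakeDataset` and `persistIfNeeded` are hypothetical stand-ins written for this example, and the stand-in is assumed to reject a change of storage level once one has been set.

```java
/** Sketch only: FakeDataset is a hypothetical stand-in, not a Spark class. */
final class PersistOnceSketch {

    /** Minimal cacheable dataset that rejects a change of storage level. */
    static final class FakeDataset {
        private String storageLevel = "NONE";

        String getStorageLevel() {
            return storageLevel;
        }

        void persist(String level) {
            // Assumption: re-persisting at a different level is an error.
            if (!storageLevel.equals("NONE") && !storageLevel.equals(level)) {
                throw new UnsupportedOperationException(
                    "storage level already set to " + storageLevel);
            }
            storageLevel = level;
        }

        void unpersist() {
            storageLevel = "NONE";
        }
    }

    /** Persists only if nothing is cached yet; returns true if this call cached it. */
    static boolean persistIfNeeded(FakeDataset data) {
        if (data.getStorageLevel().equals("NONE")) {
            data.persist("MEMORY_AND_DISK");
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        FakeDataset callerCached = new FakeDataset();
        callerCached.persist("MEMORY_ONLY");                 // the caller cached it first
        System.out.println(persistIfNeeded(callerCached));   // false
        System.out.println(callerCached.getStorageLevel());  // MEMORY_ONLY

        FakeDataset fresh = new FakeDataset();
        boolean cachedHere = persistIfNeeded(fresh);
        if (cachedHere) {
            fresh.unpersist();                               // undo only our own caching
        }
    }
}
```

Returning a flag lets the training code unpersist only the data it cached itself, leaving a caller's own caching untouched.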
BoostingStrategy
* Renamed numEstimators to numIterations
* Removed subsamplingRate (duplicated by Strategy)
* Removed categoricalFeaturesInfo: it belongs with the weak learner params, because boosting can be oblivious to feature type
* Changed algo to var (not val) and added BeanProperty, with overload taking String argument
* Added assertValid() method
* Updated defaultParams() method and eliminated defaultWeakLearnerParams() since that belongs in Strategy
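As a rough illustration of the BoostingStrategy changes above, the following standalone Java sketch models a parameter object with a string-based `defaultParams`, a `setAlgo` overload that takes a String, and an `assertValid()` check. `BoostingParamsSketch` is a hypothetical class written for this example. The default values, the case-insensitive parsing, and the specific validity checks are assumptions rather than details taken from the commit.

```java
import java.util.Locale;

/** Sketch only: a hypothetical stand-in, not Spark's BoostingStrategy. */
final class BoostingParamsSketch {
    enum Algo { CLASSIFICATION, REGRESSION }

    private Algo algo;
    private int numIterations = 100;    // assumed default
    private double learningRate = 0.1;  // assumed default

    private BoostingParamsSketch(Algo algo) {
        this.algo = algo;
    }

    /** Builds default parameters from a simple type (a String). */
    static BoostingParamsSketch defaultParams(String algoName) {
        return new BoostingParamsSketch(parseAlgo(algoName));
    }

    /** Assumption: names are matched case-insensitively. */
    static Algo parseAlgo(String name) {
        switch (name.toLowerCase(Locale.ROOT)) {
            case "classification": return Algo.CLASSIFICATION;
            case "regression":     return Algo.REGRESSION;
            default: throw new IllegalArgumentException("Unknown algo: " + name);
        }
    }

    Algo getAlgo() { return algo; }
    void setAlgo(Algo algo) { this.algo = algo; }
    void setAlgo(String algoName) { this.algo = parseAlgo(algoName); }  // String overload

    void setNumIterations(int n) { this.numIterations = n; }
    void setLearningRate(double rate) { this.learningRate = rate; }

    /** Validates the mutable parameters just before use; the checks are assumed. */
    void assertValid() {
        if (numIterations <= 0) {
            throw new IllegalArgumentException(
                "numIterations must be positive, got " + numIterations);
        }
        if (learningRate <= 0.0 || learningRate > 1.0) {
            throw new IllegalArgumentException(
                "learningRate must be in (0, 1], got " + learningRate);
        }
    }

    public static void main(String[] args) {
        BoostingParamsSketch params = defaultParams("Regression");
        params.setNumIterations(10);
        params.assertValid();  // passes
        params.setLearningRate(0.0);
        try {
            params.assertValid();
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the fields are mutable, validation cannot happen once in a constructor; a separate `assertValid()` call before training catches bad values set after construction.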
Strategy (for DecisionTree)
* Changed algo to var (not val) and added BeanProperty, with overload taking String argument
* Added setCategoricalFeaturesInfo method taking Java Map.
* Cleaned up assertValid
* Changed vals to defs, since parameters can now be changed.
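The last two Strategy points can be illustrated with a small Java sketch. `TreeParamsSketch` is a hypothetical stand-in for this example and not Spark's Strategy; its field names and the multiclass rule are assumptions. It shows a setter that accepts a plain `java.util.Map` (feature index to number of categories) and a derived property recomputed on every call, which is the Java analogue of a Scala `def`. A value computed once at construction would go stale after a setter call.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: a hypothetical stand-in, not Spark's Strategy. */
final class TreeParamsSketch {
    enum Algo { CLASSIFICATION, REGRESSION }

    private Algo algo = Algo.CLASSIFICATION;
    private int numClasses = 2;
    private Map<Integer, Integer> categoricalFeaturesInfo = new HashMap<>();

    void setAlgo(Algo algo) { this.algo = algo; }
    void setNumClasses(int numClasses) { this.numClasses = numClasses; }

    /** Accepts a plain Java Map: feature index -> number of categories. */
    void setCategoricalFeaturesInfo(Map<Integer, Integer> info) {
        this.categoricalFeaturesInfo = new HashMap<>(info);  // defensive copy
    }

    boolean isCategorical(int featureIndex) {
        return categoricalFeaturesInfo.containsKey(featureIndex);
    }

    /** Recomputed on each call (like a Scala def), so it tracks later setter calls. */
    boolean isMulticlassClassification() {
        return algo == Algo.CLASSIFICATION && numClasses > 2;
    }

    public static void main(String[] args) {
        TreeParamsSketch strategy = new TreeParamsSketch();
        System.out.println(strategy.isMulticlassClassification());  // false

        strategy.setNumClasses(5);  // change a parameter after construction
        System.out.println(strategy.isMulticlassClassification());  // true

        Map<Integer, Integer> categorical = new HashMap<>();
        categorical.put(0, 3);      // feature 0 has 3 categories
        strategy.setCategoricalFeaturesInfo(categorical);
        System.out.println(strategy.isCategorical(0));              // true
    }
}
```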
CC: manishamde mengxr codedeft
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #3094 from jkbradley/gbt-api and squashes the following commits:
7a27e22 [Joseph K. Bradley] scalastyle fix
52013d5 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into gbt-api
e9b8410 [Joseph K. Bradley] Summary of changes
(cherry picked from commit 5b3b6f6f5f029164d7749366506e142b104c1d43)
Signed-off-by: Xiangrui Meng <meng@databricks.com>