author | Joseph K. Bradley <joseph@databricks.com> | 2014-11-05 10:33:13 -0800 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2014-11-05 10:33:13 -0800 |
commit | 5b3b6f6f5f029164d7749366506e142b104c1d43 (patch) | |
tree | 78dc97e3f62d803e0f5cd3837d6a454f27e6e155 /python/lib | |
parent | 5f13759d3642ea5b58c12a756e7125ac19aff10e (diff) | |
[SPARK-4197] [mllib] GradientBoosting API cleanup and examples in Scala, Java
### Summary
* Made it easier to construct default Strategy and BoostingStrategy and to set parameters using simple types.
* Added Scala and Java examples for GradientBoostedTrees
* Small cleanups and fixes
### Details
GradientBoosting bug fixes (“bug” = bad default options)
* Force boostingStrategy.weakLearnerParams.algo = Regression
* Force boostingStrategy.weakLearnerParams.impurity = impurity.Variance
* Only persist data if not yet persisted (since it causes an error if persisted twice)
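The persist guard can be sketched without a Spark dependency. Everything below is a hypothetical stand-in (`Level`, `FakeDataset`, `PersistGuard`); in Spark the corresponding calls would be `RDD.getStorageLevel` and a comparison against `StorageLevel.NONE`. The stand-in assumes that assigning a different storage level to already-persisted data throws.

```scala
// Hypothetical stand-ins for Spark's StorageLevel values (not Spark classes).
sealed trait Level
case object NoLevel extends Level
case object MemoryOnly extends Level
case object MemoryAndDisk extends Level

// Hypothetical stand-in for an RDD: changing an assigned level is an error.
class FakeDataset {
  private var level: Level = NoLevel

  def getStorageLevel: Level = level

  def persist(newLevel: Level): this.type = {
    if (level != NoLevel && newLevel != level) {
      throw new UnsupportedOperationException(
        "Cannot change storage level after one was already assigned")
    }
    level = newLevel
    this
  }
}

object PersistGuard {
  // Persist only when the caller has not already done so.
  // Returns true if this call persisted the data, so the caller
  // knows whether it is responsible for unpersisting later.
  def persistIfNeeded(data: FakeDataset): Boolean = {
    if (data.getStorageLevel == NoLevel) {
      data.persist(MemoryAndDisk)
      true
    } else {
      false
    }
  }
}
```

Returning a flag is a design choice of this sketch, not something stated in the commit: it lets training code clean up only the caches it created itself.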
BoostingStrategy
* Renamed numEstimators to numIterations
* Removed subsamplingRate (duplicated by Strategy)
* Removed categoricalFeaturesInfo: it belongs with the weak learner params, because boosting can be oblivious to feature type
* Changed algo to var (not val) and added BeanProperty, with overload taking String argument
* Added assertValid() method
* Updated defaultParams() method and eliminated defaultWeakLearnerParams() since that belongs in Strategy
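A rough sketch of the pattern described above: a `var` with `BeanProperty`, a setter overload taking a `String`, an `assertValid()` check, and a `defaultParams()` factory. `SketchBoostingStrategy` and this `Algo` enumeration are hypothetical, simplified models written for illustration; the field set, default values, and validity checks are assumptions and do not reproduce Spark's actual class.

```scala
import scala.beans.BeanProperty

// Hypothetical stand-in for MLlib's Algo enumeration.
object Algo extends Enumeration {
  val Classification, Regression = Value
}

// Hypothetical, simplified boosting configuration (not Spark's class).
// @BeanProperty on a var generates getX/setX for Java callers.
class SketchBoostingStrategy(
    @BeanProperty var algo: Algo.Value,
    @BeanProperty var numIterations: Int = 100,
    @BeanProperty var learningRate: Double = 0.1) {

  // Overload taking a String, so callers can use a simple type
  // instead of a Scala enumeration value.
  def setAlgo(name: String): Unit = name match {
    case "Classification" => setAlgo(Algo.Classification)
    case "Regression" => setAlgo(Algo.Regression)
    case other => throw new IllegalArgumentException(s"Unknown algo: $other")
  }

  // Illustrative checks only; the real checks may differ.
  def assertValid(): Unit = {
    require(numIterations > 0, s"numIterations must be positive: $numIterations")
    require(learningRate > 0, s"learningRate must be positive: $learningRate")
  }
}

object SketchBoostingStrategy {
  // Build a default configuration from a String algo name.
  def defaultParams(algo: String): SketchBoostingStrategy = {
    val strategy = new SketchBoostingStrategy(Algo.Regression)
    strategy.setAlgo(algo)
    strategy
  }
}
```

With this shape, a caller could write `SketchBoostingStrategy.defaultParams("Classification")`, adjust fields through the generated setters, and call `assertValid()` before training.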
Strategy (for DecisionTree)
* Changed algo to var (not val) and added BeanProperty, with overload taking String argument
* Added setCategoricalFeaturesInfo method taking Java Map.
* Cleaned up assertValid
* Changed vals to defs, since parameters can now be changed after construction.
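Two of the Strategy changes can be illustrated with a small sketch: a setter accepting a Java `Map`, and a derived value declared as `def` so it follows later parameter changes. `SketchStrategy` and its fields are hypothetical simplifications, not Spark's class; the conversion uses `scala.collection.JavaConverters`, which is deprecated in newer Scala versions in favor of `scala.jdk.CollectionConverters`.

```scala
import scala.collection.JavaConverters._

// Hypothetical, simplified tree configuration (not Spark's class).
class SketchStrategy(
    var numClasses: Int = 2,
    var categoricalFeaturesInfo: Map[Int, Int] = Map.empty) {

  // A def is re-evaluated on every access, so it reflects later
  // changes to numClasses. A val would be fixed at construction.
  def isMulticlassClassification: Boolean = numClasses > 2

  // Setter for Java callers: converts a java.util.Map of boxed
  // Integers into an immutable Scala Map[Int, Int].
  def setCategoricalFeaturesInfo(
      info: java.util.Map[java.lang.Integer, java.lang.Integer]): Unit = {
    categoricalFeaturesInfo =
      info.asScala.map { case (k, v) => (k.intValue, v.intValue) }.toMap
  }
}
```

Had `isMulticlassClassification` been a `val`, setting `numClasses` after construction would leave it stale, which is presumably the motivation for the change to `def`.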
CC: manishamde mengxr codedeft
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #3094 from jkbradley/gbt-api and squashes the following commits:
7a27e22 [Joseph K. Bradley] scalastyle fix
52013d5 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into gbt-api
e9b8410 [Joseph K. Bradley] Summary of changes
Diffstat (limited to 'python/lib')
0 files changed, 0 insertions, 0 deletions