author: sethah <seth.hendrickson16@gmail.com> 2016-04-06 17:13:34 -0700
committer: Joseph K. Bradley <joseph@databricks.com> 2016-04-06 17:13:34 -0700
commit: bb873754b4700104755ab969694bf30945557dc3
tree: 02b5b39b530827fea0871ade32e4b8927edb7e9a /external
parent: 864d1b4d665e2cc1d40b53502a4ddf26c1fbfc1d
[SPARK-12382][ML] Remove mllib GBT implementation and wrap ml
## What changes were proposed in this pull request?

This patch removes the implementation of gradient boosted trees in mllib/tree/GradientBoostedTrees.scala and changes mllib GBTs to call the implementation in spark.ML.

Primary changes:

* Removed `boost` method in mllib GradientBoostedTrees.scala
* Created new test suite GradientBoostedTreesSuite in ML, which contains unit tests that were specific to GBT internals from mllib

Other changes:

* Added an `updatePrediction` method in GradientBoostedTrees package. This method is added to provide consistency for methods that build predictions from boosted models. There are several methods that hard code the method of predicting as: sum_{i=1}^{numTrees} (treePrediction*treeWeight). Calling this function ensures that test methods that check accuracy use the same prediction method that the algorithm uses during training.
* Added methods that were previously only used in testing, but were public methods, to GradientBoostedTrees. This includes `computeError` (previously part of `Loss` trait) and `evaluateEachIteration`. These are used in the new spark.ML unit tests. They are left in mllib as well so as to not break the API.

## How was this patch tested?

Existing unit tests which compare ML and MLlib ensure that mllib GBTs have not changed. Only a single unit test was moved to ML, which verifies that `runWithValidation` performs as expected.

Author: sethah <seth.hendrickson16@gmail.com>

Closes #12050 from sethah/SPARK-12382.
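
The prediction rule quoted in the commit message, sum_{i=1}^{numTrees} (treePrediction*treeWeight), can be illustrated with a minimal plain-Python sketch. This is not the Spark code: the names `update_prediction`, `predict`, and `evaluate_each_iteration` are hypothetical stand-ins that loosely mirror `updatePrediction` and `evaluateEachIteration`, each tree is modeled as an ordinary function from a feature vector to a number, and mean squared error is assumed as the loss purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a fitted tree: maps a feature vector to a prediction.
Tree = Callable[[Sequence[float]], float]


def update_prediction(features: Sequence[float], current: float,
                      tree: Tree, weight: float) -> float:
    """Add one tree's weighted prediction to the running ensemble prediction."""
    return current + tree(features) * weight


def predict(features: Sequence[float], trees: List[Tree],
            weights: List[float]) -> float:
    """Weighted sum over all trees, built through the single shared helper."""
    prediction = 0.0
    for tree, weight in zip(trees, weights):
        prediction = update_prediction(features, prediction, tree, weight)
    return prediction


def evaluate_each_iteration(data: List[Tuple[Sequence[float], float]],
                            trees: List[Tree],
                            weights: List[float]) -> List[float]:
    """Mean squared error of the partial ensemble after each boosting iteration.

    Squared error is an assumption made for this sketch only.
    """
    predictions = [0.0] * len(data)
    errors = []
    for tree, weight in zip(trees, weights):
        predictions = [
            update_prediction(x, p, tree, weight)
            for (x, _), p in zip(data, predictions)
        ]
        mse = sum((p - y) ** 2 for (_, y), p in zip(data, predictions)) / len(data)
        errors.append(mse)
    return errors


# Toy ensemble: a constant "tree" and a "tree" that returns the first feature.
trees = [lambda x: 1.0, lambda x: x[0]]
weights = [1.0, 0.5]
data = [([4.0], 3.0), ([2.0], 1.0)]
print(predict([4.0], trees, weights))
print(evaluate_each_iteration(data, trees, weights))
```

Routing both the final prediction and the per-iteration evaluation through one helper is the point of the sketch: any check of accuracy then uses the same accumulation rule as the ensemble itself, instead of re-deriving the weighted sum in several places.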
Diffstat (limited to 'external')
0 files changed, 0 insertions, 0 deletions