author | Joseph K. Bradley <joseph@databricks.com> | 2016-03-16 14:18:35 -0700 |
---|---|---|
committer | Joseph K. Bradley <joseph@databricks.com> | 2016-03-16 14:18:35 -0700 |
commit | 6fc2b6541fd5ab73b289af5f7296fc602b5b4dce | |
tree | ec8da69765b849a72e0faf5914f25a6dbd4d21f6 /project/MimaExcludes.scala | |
parent | 3f06eb72ca0c3e5779a702c7c677229e0c480751 | |
[SPARK-11888][ML] Decision tree persistence in spark.ml
### What changes were proposed in this pull request?
Made these MLReadable and MLWritable: DecisionTreeClassifier, DecisionTreeClassificationModel, DecisionTreeRegressor, DecisionTreeRegressionModel
* The shared implementation is in treeModels.scala
* I use case classes to create a DataFrame to save, and I use the Dataset API to parse loaded files.
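
As a rough illustration of the case-class approach described above, the sketch below flattens a small binary tree into one flat record per node and rebuilds it from those records. It is plain Scala with no Spark dependency, and every name in it (`Node`, `Leaf`, `Internal`, `NodeData`, `TreeIO`) is hypothetical; it is not the code in treeModels.scala. In Spark, a sequence of case-class records like this is the kind of data that can be turned into a DataFrame for saving and read back as a typed Dataset.

```scala
// Hypothetical, simplified tree types; not the classes used in spark.ml.
sealed trait Node
case class Leaf(prediction: Double) extends Node
case class Internal(featureIndex: Int, threshold: Double, left: Node, right: Node) extends Node

// One flat record per node. Child ids of -1 mark a leaf (an assumption of this sketch).
case class NodeData(
    id: Int,
    prediction: Double,
    featureIndex: Int,
    threshold: Double,
    leftChild: Int,
    rightChild: Int)

object TreeIO {
  /** Flatten a tree into records using pre-order ids; returns the records and the next free id. */
  def flatten(node: Node, id: Int = 0): (Seq[NodeData], Int) = node match {
    case Leaf(p) =>
      (Seq(NodeData(id, p, -1, 0.0, -1, -1)), id + 1)
    case Internal(f, t, l, r) =>
      // The left subtree starts at id + 1; the right subtree starts where the left one ended.
      val (leftRows, afterLeft) = flatten(l, id + 1)
      val (rightRows, afterRight) = flatten(r, afterLeft)
      (NodeData(id, 0.0, f, t, id + 1, afterLeft) +: (leftRows ++ rightRows), afterRight)
  }

  /** Rebuild the tree from its records, starting at the root (id 0). */
  def rebuild(rows: Seq[NodeData]): Node = {
    val byId = rows.map(r => r.id -> r).toMap
    def build(id: Int): Node = {
      val r = byId(id)
      if (r.leftChild < 0) Leaf(r.prediction)
      else Internal(r.featureIndex, r.threshold, build(r.leftChild), build(r.rightChild))
    }
    build(0)
  }
}
```

Because the records are looked up by id, `rebuild` does not depend on the order in which they are read back.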
Other changes:
* Made CategoricalSplit.numCategories public (to use in persistence)
* Fixed a bug in DefaultReadWriteTest.testEstimatorAndModelReadWrite, where it did not call the checkModelData function passed as an argument. This caused an error in LDASuite, which I fixed.
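
The kind of bug described in the last item can be sketched in a few lines. The helper below is a hypothetical stand-in, not the actual `DefaultReadWriteTest.testEstimatorAndModelReadWrite` signature: it takes a caller-supplied `checkModelData` function and must invoke it, otherwise model-specific checks are silently skipped.

```scala
object ReadWriteCheck {
  /**
   * Hypothetical round-trip helper: save a model, load it back, and compare.
   * `save` and `load` are placeholders for whatever persistence mechanism is in use.
   */
  def roundTrip[M](model: M, save: M => String, load: String => M)(
      checkModelData: (M, M) => Unit): M = {
    val loaded = load(save(model))
    // Forgetting this call would make every caller's data check a no-op.
    checkModelData(model, loaded)
    loaded
  }
}
```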
### How was this patch tested?
Persistence is tested via unit tests. For each algorithm, there are two non-trivial trees (depth 2): one built with continuous features and one with categorical features. This ensures that both types of splits are tested.
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #11581 from jkbradley/dt-io.
Diffstat (limited to 'project/MimaExcludes.scala')
0 files changed, 0 insertions, 0 deletions