[SPARK-11888][ML] Decision tree persistence in spark.ml - spark

author	Joseph K. Bradley <joseph@databricks.com>	2016-03-16 14:18:35 -0700
committer	Joseph K. Bradley <joseph@databricks.com>	2016-03-16 14:18:35 -0700
commit	6fc2b6541fd5ab73b289af5f7296fc602b5b4dce (patch)
tree	ec8da69765b849a72e0faf5914f25a6dbd4d21f6 /project
parent	3f06eb72ca0c3e5779a702c7c677229e0c480751 (diff)

[SPARK-11888][ML] Decision tree persistence in spark.ml

### What changes were proposed in this pull request?

Made these MLReadable and MLWritable: DecisionTreeClassifier, DecisionTreeClassificationModel, DecisionTreeRegressor, DecisionTreeRegressionModel
* The shared implementation is in treeModels.scala
* I use case classes to create a DataFrame to save, and I use the Dataset API to parse loaded files.

Other changes:
* Made CategoricalSplit.numCategories public (to use in persistence)
* Fixed a bug in DefaultReadWriteTest.testEstimatorAndModelReadWrite, where it did not call the checkModelData function passed as an argument. This caused an error in LDASuite, which I fixed.

### How was this patch tested?

Persistence is tested via unit tests. For each algorithm, there are 2 non-trivial trees (depth 2). One is built with continuous features, and one with categorical; this ensures that both types of splits are tested.

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #11581 from jkbradley/dt-io.
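
The "case classes to create a DataFrame" approach described above can be pictured with a small sketch in plain Scala, without a Spark dependency. It is an illustration only: `Node`, `Leaf`, `Internal`, `NodeRow`, and `TreeRows` are hypothetical names chosen here, and the actual row layout in treeModels.scala may differ. The idea shown is that a recursive tree is flattened into one row per node, with children referenced by id, and then rebuilt from those rows after loading.

```scala
// Hypothetical, simplified types; not the definitions used in treeModels.scala.
sealed trait Node
case class Leaf(prediction: Double) extends Node
case class Internal(featureIndex: Int, threshold: Double, left: Node, right: Node) extends Node

// One flat row per node, the kind of case class that could back a DataFrame.
// A child id of -1 marks a leaf; internal nodes carry a placeholder prediction.
case class NodeRow(
    id: Int,
    prediction: Double,
    featureIndex: Int,
    threshold: Double,
    leftChild: Int,
    rightChild: Int)

object TreeRows {
  // Pre-order flattening: returns the rows of the subtree and the next unused id.
  def flatten(node: Node, id: Int = 0): (Seq[NodeRow], Int) = node match {
    case Leaf(p) =>
      (Seq(NodeRow(id, p, -1, 0.0, -1, -1)), id + 1)
    case Internal(f, t, l, r) =>
      val (leftRows, afterLeft) = flatten(l, id + 1)
      val (rightRows, afterRight) = flatten(r, afterLeft)
      // The left child gets id + 1; the right child gets the first id after the left subtree.
      val self = NodeRow(id, 0.0, f, t, id + 1, afterLeft)
      (self +: (leftRows ++ rightRows), afterRight)
  }

  // Rebuild the tree by following child ids from the root row.
  def rebuild(rows: Seq[NodeRow], rootId: Int = 0): Node = {
    val byId = rows.map(r => r.id -> r).toMap
    def build(id: Int): Node = {
      val r = byId(id)
      if (r.leftChild < 0) Leaf(r.prediction)
      else Internal(r.featureIndex, r.threshold, build(r.leftChild), build(r.rightChild))
    }
    build(rootId)
  }
}

object Demo extends App {
  // A depth-2 tree with continuous splits, similar in shape to the trees the tests describe.
  val tree = Internal(0, 0.5, Leaf(0.0), Internal(1, 2.0, Leaf(1.0), Leaf(0.0)))
  val (rows, _) = TreeRows.flatten(tree)
  assert(TreeRows.rebuild(rows) == tree)
}
```

In a Spark setting, a sequence of such rows could presumably be turned into a DataFrame for writing and read back as a typed Dataset; that wiring is left out here to keep the sketch self-contained.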

Diffstat (limited to 'project')

0 files changed, 0 insertions, 0 deletions

