author | Yanbo Liang <ybliang8@gmail.com> | 2016-06-28 11:54:25 -0700 |
---|---|---|
committer | Joseph K. Bradley <joseph@databricks.com> | 2016-06-28 11:54:25 -0700 |
commit | 26252f7064ba852e1bce6d8233a95aeb395f826a (patch) | |
tree | fbabb2251d98728444eb61c4a898db1e73bf2c6e /docs/mllib-migration-guides.md | |
parent | 1f2776df6e87a84991537ac20e4b8829472d3462 (diff) | |
[SPARK-15643][DOC][ML] Update spark.ml and spark.mllib migration guide from 1.6 to 2.0
## What changes were proposed in this pull request?
Update `spark.ml` and `spark.mllib` migration guide from 1.6 to 2.0.
## How was this patch tested?
Docs update, no tests.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #13378 from yanboliang/spark-13448.
Diffstat (limited to 'docs/mllib-migration-guides.md')
-rw-r--r-- | docs/mllib-migration-guides.md | 27 |
1 files changed, 27 insertions, 0 deletions
diff --git a/docs/mllib-migration-guides.md b/docs/mllib-migration-guides.md
index f3daef2dba..970c6697f4 100644
--- a/docs/mllib-migration-guides.md
+++ b/docs/mllib-migration-guides.md
@@ -7,6 +7,33 @@ description: MLlib migration guides from before Spark SPARK_VERSION_SHORT
 
 The migration guide for the current Spark version is kept on the [MLlib Programming Guide main page](mllib-guide.html#migration-guide).
 
+## From 1.5 to 1.6
+
+There are no breaking API changes in the `spark.mllib` or `spark.ml` packages, but there are
+deprecations and changes of behavior.
+
+Deprecations:
+
+* [SPARK-11358](https://issues.apache.org/jira/browse/SPARK-11358):
+  In `spark.mllib.clustering.KMeans`, the `runs` parameter has been deprecated.
+* [SPARK-10592](https://issues.apache.org/jira/browse/SPARK-10592):
+  In `spark.ml.classification.LogisticRegressionModel` and
+  `spark.ml.regression.LinearRegressionModel`, the `weights` field has been deprecated in favor of
+  the new name `coefficients`. This helps disambiguate from instance (row) "weights" given to
+  algorithms.
+
+Changes of behavior:
+
+* [SPARK-7770](https://issues.apache.org/jira/browse/SPARK-7770):
+  `spark.mllib.tree.GradientBoostedTrees`: `validationTol` has changed semantics in 1.6.
+  Previously, it was a threshold for absolute change in error. Now, it resembles the behavior of
+  `GradientDescent`'s `convergenceTol`: For large errors, it uses relative error (relative to the
+  previous error); for small errors (`< 0.01`), it uses absolute error.
+* [SPARK-11069](https://issues.apache.org/jira/browse/SPARK-11069):
+  `spark.ml.feature.RegexTokenizer`: Previously, it did not convert strings to lowercase before
+  tokenizing. Now, it converts to lowercase by default, with an option not to. This matches the
+  behavior of the simpler `Tokenizer` transformer.
+
 ## From 1.4 to 1.5
 
 In the `spark.mllib` package, there are no breaking API changes but several behavior changes:
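
Reviewer note on the `validationTol` entry (SPARK-7770): the change from an absolute to a mostly relative threshold may be easier to follow with a small sketch. The Python below is not Spark's implementation; it only models the two stopping rules as the guide text describes them. The function names and the `small_error` floor parameter are hypothetical, and expressing the "relative for large errors, absolute for small errors" rule as a `max(...)` floor is an assumption drawn from the comparison to `convergenceTol`.

```python
def should_stop_old(prev_error, curr_error, validation_tol):
    """Pre-1.6 reading: stop once the absolute improvement in
    validation error drops below the tolerance."""
    return prev_error - curr_error < validation_tol


def should_stop_new(prev_error, curr_error, validation_tol, small_error=0.01):
    """1.6 reading (assumed form): the tolerance is scaled by the previous
    error, so the test is relative for large errors. Below `small_error`
    the scale is held fixed, which makes the test absolute again."""
    scale = max(prev_error, small_error)
    return prev_error - curr_error < validation_tol * scale


if __name__ == "__main__":
    # Large error: an improvement of 0.5 on an error of 100 is only 0.5%.
    print(should_stop_old(100.0, 99.5, 0.01))
    print(should_stop_new(100.0, 99.5, 0.01))
    # Small error: the scale is floored at 0.01 rather than shrinking further.
    print(should_stop_old(0.005, 0.004, 0.01))
    print(should_stop_new(0.005, 0.004, 0.01))
```

Under these assumptions the same `validationTol` value can stop training earlier on large-error problems and later on small-error ones than it did before 1.6, so a tolerance tuned against 1.5 may need revisiting.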