aboutsummaryrefslogtreecommitdiff
path: root/docs/mllib-migration-guides.md
diff options
context:
space:
mode:
authorJoseph K. Bradley <joseph@databricks.com>2015-06-21 16:25:25 -0700
committerXiangrui Meng <meng@databricks.com>2015-06-21 16:25:25 -0700
commita1894422ad6b3335c84c73ba9466da6677d893cb (patch)
tree8bba7cc2493b57e8e24f8f28003836c2b72cbec7 /docs/mllib-migration-guides.md
parent83cdfd84f8ca679e1ec451ed88b946e8e7f13a94 (diff)
downloadspark-a1894422ad6b3335c84c73ba9466da6677d893cb.tar.gz
spark-a1894422ad6b3335c84c73ba9466da6677d893cb.tar.bz2
spark-a1894422ad6b3335c84c73ba9466da6677d893cb.zip
[SPARK-7715] [MLLIB] [ML] [DOC] Updated MLlib programming guide for release 1.4
Reorganized docs a bit. Added migration guides. **Q**: Do we want to say more for the 1.3 -> 1.4 migration guide for ```spark.ml```? It would be a lot. CC: mengxr Author: Joseph K. Bradley <joseph@databricks.com> Closes #6897 from jkbradley/ml-guide-1.4 and squashes the following commits: 4bf26d6 [Joseph K. Bradley] tiny fix 8085067 [Joseph K. Bradley] fixed spacing/layout issues in ml guide from previous commit in this PR 6cd5c78 [Joseph K. Bradley] Updated MLlib programming guide for release 1.4
Diffstat (limited to 'docs/mllib-migration-guides.md')
-rw-r--r--docs/mllib-migration-guides.md16
1 files changed, 16 insertions, 0 deletions
diff --git a/docs/mllib-migration-guides.md b/docs/mllib-migration-guides.md
index 4de2d9491a..8df68d81f3 100644
--- a/docs/mllib-migration-guides.md
+++ b/docs/mllib-migration-guides.md
@@ -7,6 +7,22 @@ description: MLlib migration guides from before Spark SPARK_VERSION_SHORT
The migration guide for the current Spark version is kept on the [MLlib Programming Guide main page](mllib-guide.html#migration-guide).
+## From 1.2 to 1.3
+
+In the `spark.mllib` package, there were several breaking changes. The first change (in `ALS`) is the only one in a component not marked as Alpha or Experimental.
+
+* *(Breaking change)* In [`ALS`](api/scala/index.html#org.apache.spark.mllib.recommendation.ALS), the extraneous method `solveLeastSquares` has been removed. The `DeveloperApi` method `analyzeBlocks` was also removed.
+* *(Breaking change)* [`StandardScalerModel`](api/scala/index.html#org.apache.spark.mllib.feature.StandardScalerModel) remains an Alpha component. In it, the `variance` method has been replaced with the `std` method. To compute the column variance values returned by the original `variance` method, simply square the standard deviation values returned by `std`.
+* *(Breaking change)* [`StreamingLinearRegressionWithSGD`](api/scala/index.html#org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD) remains an Experimental component. In it, there were two changes:
+ * The constructor taking arguments was removed in favor of a builder pattern using the default constructor plus parameter setter methods.
+ * Variable `model` is no longer public.
+* *(Breaking change)* [`DecisionTree`](api/scala/index.html#org.apache.spark.mllib.tree.DecisionTree) remains an Experimental component. In it and its associated classes, there were several changes:
+ * In `DecisionTree`, the deprecated class method `train` has been removed. (The object/static `train` methods remain.)
+ * In `Strategy`, the `checkpointDir` parameter has been removed. Checkpointing is still supported, but the checkpoint directory must be set before calling tree and tree ensemble training.
+* `PythonMLlibAPI` (the interface between Scala/Java and Python for MLlib) was a public API but is now private, declared `private[python]`. This was never meant for external use.
+* In linear regression (including Lasso and ridge regression), the squared loss is now divided by 2.
+ So in order to produce the same result as in 1.2, the regularization parameter needs to be divided by 2 and the step size needs to be multiplied by 2.
+
## From 1.1 to 1.2
The only API changes in MLlib v1.2 are in