author    Xiangrui Meng <meng@databricks.com>  2014-08-20 17:47:39 -0700
committer Xiangrui Meng <meng@databricks.com>  2014-08-20 17:47:39 -0700
commit    e0f946265b9ea5bc48849cf7794c2c03d5e29fba (patch)
tree      08a29e214853575e0c9366a5294dce060b81a3f1 /docs/mllib-collaborative-filtering.md
parent    e1571874f26c1df2dfd5ac2959612372716cd2d8 (diff)
download  spark-e0f946265b9ea5bc48849cf7794c2c03d5e29fba.tar.gz
          spark-e0f946265b9ea5bc48849cf7794c2c03d5e29fba.tar.bz2
          spark-e0f946265b9ea5bc48849cf7794c2c03d5e29fba.zip
[SPARK-2843][MLLIB] add a section about regularization parameter in ALS
atalwalkar srowen

Author: Xiangrui Meng <meng@databricks.com>

Closes #2064 from mengxr/als-doc and squashes the following commits:

b2e20ab [Xiangrui Meng] introduced -> discussed
98abdd7 [Xiangrui Meng] add reference
339bd08 [Xiangrui Meng] add a section about regularization parameter in ALS
Diffstat (limited to 'docs/mllib-collaborative-filtering.md')
-rw-r--r--  docs/mllib-collaborative-filtering.md | 11 +
1 file changed, 11 insertions(+), 0 deletions(-)
diff --git a/docs/mllib-collaborative-filtering.md b/docs/mllib-collaborative-filtering.md
index ab10b2f01f..d5c539db79 100644
--- a/docs/mllib-collaborative-filtering.md
+++ b/docs/mllib-collaborative-filtering.md
@@ -43,6 +43,17 @@ level of confidence in observed user preferences, rather than explicit ratings g
model then tries to find latent factors that can be used to predict the expected preference of a
user for an item.
+### Scaling of the regularization parameter
+
+Since v1.1, we scale the regularization parameter `lambda` in each least squares problem by
+the number of ratings the user generated when updating user factors,
+or the number of ratings the product received when updating product factors.
+This approach is named "ALS-WR" and is discussed in the paper
+"[Large-Scale Parallel Collaborative Filtering for the Netflix Prize](http://dx.doi.org/10.1007/978-3-540-68880-8_32)".
+It makes `lambda` less dependent on the scale of the dataset.
+As a result, we can apply the best parameter learned from a sampled subset to the full dataset
+and expect similar performance.
+
## Examples
<div class="codetabs">
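To make the ALS-WR scaling concrete, here is a minimal NumPy sketch of a single user-factor update, not Spark's implementation: the effective regularization in the normal equations is `lambda` multiplied by the number of ratings that user generated (`update_user_factor` and its arguments are illustrative names, not MLlib API).

```python
import numpy as np

def update_user_factor(item_factors, ratings, lam):
    """Solve one ALS least squares subproblem for a single user.

    item_factors: (n_u, k) array of factors for the items this user rated.
    ratings:      (n_u,) array of the user's observed ratings.
    lam:          base regularization parameter `lambda`.
    """
    n_u = len(ratings)          # number of ratings the user generated
    k = item_factors.shape[1]
    # ALS-WR: scale lambda by n_u, so regularization tracks how much
    # data backs this user's factor vector.
    A = item_factors.T @ item_factors + lam * n_u * np.eye(k)
    b = item_factors.T @ ratings
    return np.linalg.solve(A, b)
```

Because the penalty grows with `n_u`, heavily rated users and items are regularized proportionally to their data, which is what makes a `lambda` tuned on a sampled subset transfer to the full dataset.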