author Shuo Xiang <sxiang@twitter.com> 2014-06-12 17:37:06 -0700
committer Xiangrui Meng <meng@databricks.com> 2014-06-12 17:37:06 -0700
commit a6e0afdcf0174425e8a6ff20b2bc2e3a7a374f19
tree 380e0147d1a617480bfa9efa14753c88f99ccc32 /core
parent 1c04652c8f18566baafb13dbae355f8ad2ad8d37
SPARK-2085: [MLlib] Apply user-specific regularization instead of uniform regularization in ALS
The current implementation of ALS takes a single regularization parameter and applies it to both the user factors and the product factors. This kind of regularization can be less effective when the number of users is significantly larger than the number of products (or vice versa). For example, if we have 10M users and 1K products, regularization on the user factors will dominate. Following the discussion in [this thread](http://apache-spark-user-list.1001560.n3.nabble.com/possible-bug-in-Spark-s-ALS-implementation-tt2567.html#a2704), the implementation in this PR regularizes each factor vector by #ratings * lambda.

Author: Shuo Xiang <sxiang@twitter.com>

Closes #1026 from coderxiang/als-reg and squashes the following commits:

93dfdb4 [Shuo Xiang] Merge remote-tracking branch 'upstream/master' into als-reg
b98f19c [Shuo Xiang] merge latest master
52c7b58 [Shuo Xiang] Apply user-specific regularization instead of uniform regularization in Alternating Least Squares (ALS)
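
To illustrate the `#ratings * lambda` weighting described in the commit message, here is a minimal Python sketch. It is not the code from this PR. The function name `regularized_gram` and its parameters are hypothetical. It assumes that, for one user (or product), the least-squares step solves normal equations whose matrix is the Gram matrix of the rated factor vectors plus a diagonal regularization term, and that the weighted scheme scales that term by the number of ratings.

```python
def regularized_gram(factors, lam, weighted=True):
    """Return Y^T Y + reg * I for a single user (or product).

    factors:  list of equal-length factor vectors, one per observed rating
              (hypothetical input layout, chosen for illustration).
    lam:      base regularization parameter (lambda).
    weighted: if True, use reg = lam * number_of_ratings (assumed scheme);
              if False, use the uniform reg = lam.
    """
    if not factors:
        raise ValueError("need at least one rated factor vector")
    rank = len(factors[0])
    n_ratings = len(factors)
    reg = lam * n_ratings if weighted else lam

    # Accumulate the Gram matrix Y^T Y one rating at a time.
    gram = [[0.0] * rank for _ in range(rank)]
    for y in factors:
        for i in range(rank):
            for j in range(rank):
                gram[i][j] += y[i] * y[j]

    # Add the regularization term to the diagonal only.
    for i in range(rank):
        gram[i][i] += reg
    return gram
```

With the weighted scheme, a user with many ratings gets a proportionally larger diagonal term than a user with few, whereas with `weighted=False` every factor vector receives the same `lam` regardless of how many ratings it has.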
Diffstat (limited to 'core')
0 files changed, 0 insertions, 0 deletions