author | DB Tsai <dbtsai@alpinenow.com> | 2014-12-22 16:42:55 -0800 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2014-12-22 16:42:55 -0800 |
commit | a96b72781ae40bb303613990b8d8b4721b84e1c3 (patch) | |
tree | 69ed3021cbc056f925c7214a824c1ade622ad878 /bin | |
parent | c233ab3d8d75a33495298964fe73dbf7dd8fe305 (diff) | |
[SPARK-4907][MLlib] Inconsistent loss and gradient in LeastSquaresGradient compared with R
In most academic papers and algorithm implementations, the least-squares
loss is defined as L = 1/(2n) ||A weights - y||^2 rather than
L = 1/n ||A weights - y||^2. See Eq. (1) in http://web.stanford.edu/~hastie/Papers/glmnet.pdf
Because MLlib uses a different convention, its residuals and all derived
statistical properties differ from those of the GLMNET package in R.
The model coefficients remain the same under this change.
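
A minimal sketch of why the coefficients are unaffected, in plain Python rather than MLlib code. The function names, the toy data, and the constant step size are all assumptions made for illustration; MLlib's own optimizer and step-size schedule are not reproduced here. Halving the loss halves the gradient, so plain gradient descent follows the same path once the step size is doubled:

```python
def loss_and_gradient(rows, labels, weights, scale):
    """Loss and gradient of L = scale * sum((x . w - y)^2) / n.

    scale=1.0 is the 1/n convention, scale=0.5 the 1/(2n) convention.
    """
    n = len(rows)
    loss = 0.0
    grad = [0.0] * len(weights)
    for x, y in zip(rows, labels):
        diff = sum(xi * wi for xi, wi in zip(x, weights)) - y
        loss += scale * diff * diff / n
        for j, xj in enumerate(x):
            grad[j] += 2.0 * scale * diff * xj / n
    return loss, grad


def gradient_descent(rows, labels, scale, step_size, iterations):
    """Plain gradient descent with a constant step size (an assumption)."""
    weights = [0.0] * len(rows[0])
    for _ in range(iterations):
        _, grad = loss_and_gradient(rows, labels, weights, scale)
        weights = [w - step_size * g for w, g in zip(weights, grad)]
    return weights


# Hypothetical toy data: the first column is a constant intercept feature.
rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
labels = [1.0, 3.1, 4.9, 7.2]

# 1/n convention with step size s versus 1/(2n) convention with step size 2s.
old = gradient_descent(rows, labels, scale=1.0, step_size=0.05, iterations=2000)
new = gradient_descent(rows, labels, scale=0.5, step_size=0.10, iterations=2000)

# Same weights, but the reported loss differs by a factor of two.
old_loss, _ = loss_and_gradient(rows, labels, old, scale=1.0)
new_loss, _ = loss_and_gradient(rows, labels, new, scale=0.5)
print(old, new, old_loss, new_loss)
```

Only the reported loss and gradient change scale; the minimizer is the same under both conventions.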
Author: DB Tsai <dbtsai@alpinenow.com>
Closes #3746 from dbtsai/lir and squashes the following commits:
19c2e85 [DB Tsai] make stepsize twice to converge to the same solution
0b2c29c [DB Tsai] first commit
Diffstat (limited to 'bin')
0 files changed, 0 insertions, 0 deletions