path: root/R/pkg/inst/tests/testthat/test_mllib.R
author: actuaryzhang <actuaryzhang10@gmail.com> 2016-12-07 16:37:25 +0800
committer: Sean Owen <sowen@cloudera.com> 2016-12-07 16:37:25 +0800
commit: b8280271396eb74638da6546d76bbb2d06c7011b
tree: 5f28b6743c029b9da2566b2fc6f295ed164e6c27 /R/pkg/inst/tests/testthat/test_mllib.R
parent: 90b59d1bf262b41c3a5f780697f504030f9d079c
[SPARK-18701][ML] Fix Poisson GLM failure due to wrong initialization
Poisson GLM fails for many standard data sets (see the example in the test or JIRA). The cause is incorrect initialization, which leads to almost-zero probabilities and weights. Specifically, the mean is initialized as the response, which can be zero. Applying the log link then yields very negative numbers (protected against -Inf), which in turn leads to close-to-zero probabilities and weights in the weighted least squares. The fix and a test are included in the commits.

## What changes were proposed in this pull request?

Update initialization in Poisson GLM

## How was this patch tested?

Add test in GeneralizedLinearRegressionSuite

srowen sethah yanboliang HyukjinKwon mengxr

Author: actuaryzhang <actuaryzhang10@gmail.com>

Closes #16131 from actuaryzhang/master.
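The failure mode described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the code from this patch: the clamp value `EPSILON`, the function names, and the offset `delta=0.1` are all assumptions made for illustration (the offset mirrors the `y + 0.1` start value used by R's `poisson` family; the commit message does not state which value the fix uses). It relies on the fact that for a Poisson model with log link the IRLS working weight reduces to the mean itself.

```python
import math

# Hypothetical lower clamp standing in for the "-Inf protection"
# mentioned in the commit message; the real value may differ.
EPSILON = 1e-16


def naive_init(y):
    """Initialize the mean as the response itself, clamped away from zero."""
    return max(y, EPSILON)


def offset_init(y, delta=0.1):
    """Initialize the mean as the response plus a small positive offset.

    delta=0.1 is an assumption borrowed from R's poisson family.
    """
    return y + delta


def irls_start(mu):
    """Return (eta, weight) for a Poisson model with log link.

    With g(mu) = log(mu) and Var(mu) = mu, the working weight
    1 / (Var(mu) * g'(mu)**2) simplifies to mu.
    """
    eta = math.log(mu)
    weight = mu
    return eta, weight


if __name__ == "__main__":
    for y in [0.0, 1.0, 4.0]:
        print("y =", y)
        print("  naive  (eta, weight):", irls_start(naive_init(y)))
        print("  offset (eta, weight):", irls_start(offset_init(y)))
```

For a zero response, the naive start gives a linear predictor far below zero and a weight that is effectively zero, so that observation contributes almost nothing to the first weighted least squares step. With the offset, both quantities stay at a moderate scale.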
Diffstat (limited to 'R/pkg/inst/tests/testthat/test_mllib.R')
0 files changed, 0 insertions, 0 deletions