author | Yanbo Liang <ybliang8@gmail.com> | 2016-06-07 15:25:36 -0700 |
---|---|---|
committer | Yanbo Liang <ybliang8@gmail.com> | 2016-06-07 15:25:36 -0700 |
commit | 6ecedf39b44c9acd58cdddf1a31cf11e8e24428c (patch) | |
tree | 480604299bd07f81c1166d80214b8a1433ff95fd /docs/ml-classification-regression.md | |
parent | 890baaca5078df0b50c0054f55a2c33023f7fd67 (diff) | |
[SPARK-13590][ML][DOC] Document spark.ml LiR, LoR and AFTSurvivalRegression behavior difference
## What changes were proposed in this pull request?
When fitting ```LinearRegressionModel``` (with the "l-bfgs" solver) or ```LogisticRegressionModel``` without an intercept on a dataset with a constant nonzero column, spark.ml produces the same model as R glmnet but a different model from LIBSVM.
When fitting ```AFTSurvivalRegressionModel``` without an intercept on a dataset with a constant nonzero column, spark.ml produces a different model from R survival::survreg.
We should output a warning message for this condition and clarify the behavior in the documentation.
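As a rough illustration of the condition described above, the following pure-Python sketch flags columns that hold the same nonzero value in every row and warns when a fit without intercept is requested. The helper names `constant_nonzero_columns` and `warn_if_constant_nonzero` are hypothetical and are not part of Spark; the actual check in spark.ml may be implemented differently.

```python
import warnings


def constant_nonzero_columns(rows):
    """Return indices of columns holding the same nonzero value in every row.

    Hypothetical helper for illustration only; not a Spark API.
    """
    if not rows:
        return []
    first = rows[0]
    return [
        j
        for j, value in enumerate(first)
        if value != 0 and all(row[j] == value for row in rows)
    ]


def warn_if_constant_nonzero(rows, fit_intercept):
    """Warn when fitting without an intercept on data with such columns."""
    columns = constant_nonzero_columns(rows)
    if not fit_intercept and columns:
        warnings.warn(
            "Fitting without intercept on a dataset with constant nonzero "
            f"column(s) {columns}: coefficients for these columns will be zero."
        )
    return columns


features = [
    [1.0, 5.0, 0.0],  # column 0 varies, column 1 is constant 5.0,
    [2.0, 5.0, 0.0],  # column 2 is constant but zero, so it is not flagged
    [3.0, 5.0, 0.0],
]
flagged = warn_if_constant_nonzero(features, fit_intercept=False)
print(flagged)  # [1]
```

Only column 1 is reported: a constant zero column contributes nothing to the model under any solver, so the behavior difference discussed here does not apply to it.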
## How was this patch tested?
Documentation change only; no unit tests.
cc mengxr
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #12731 from yanboliang/spark-13590.
Diffstat (limited to 'docs/ml-classification-regression.md')
-rw-r--r-- | docs/ml-classification-regression.md | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/docs/ml-classification-regression.md b/docs/ml-classification-regression.md
index ff8dec6d2d..88457d4bb1 100644
--- a/docs/ml-classification-regression.md
+++ b/docs/ml-classification-regression.md
@@ -62,6 +62,8 @@ For more background and more details about the implementation, refer to the docu
 
  > The current implementation of logistic regression in `spark.ml` only supports binary classes. Support for multiclass regression will be added in the future.
 
+ > When fitting LogisticRegressionModel without intercept on dataset with constant nonzero column, Spark MLlib outputs zero coefficients for constant nonzero columns. This behavior is the same as R glmnet but different from LIBSVM.
+
 **Example**
 
 The following example shows how to train a logistic regression model
@@ -351,6 +353,8 @@ Refer to the [Python API docs](api/python/pyspark.ml.html#pyspark.ml.classificat
 The interface for working with linear regression models and model summaries
 is similar to the logistic regression case.
 
+ > When fitting LinearRegressionModel without intercept on dataset with constant nonzero column by "l-bfgs" solver, Spark MLlib outputs zero coefficients for constant nonzero columns. This behavior is the same as R glmnet but different from LIBSVM.
+
 **Example**
 
 The following
@@ -666,6 +670,8 @@ The optimization algorithm underlying the implementation is L-BFGS.
 The implementation matches the result from R's survival function
 [survreg](https://stat.ethz.ch/R-manual/R-devel/library/survival/html/survreg.html)
 
+ > When fitting AFTSurvivalRegressionModel without intercept on dataset with constant nonzero column, Spark MLlib outputs zero coefficients for constant nonzero columns. This behavior is different from R survival::survreg.
+
 **Example**
 
 <div class="codetabs">
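To make the documented difference concrete, here is a small pure-Python sketch, assuming plain unregularized least squares on one varying feature `x` and one constant nonzcolumn-like feature `c` (constant and nonzero in every row). It does not call Spark, glmnet, or LIBSVM; the function names are hypothetical. It only contrasts a fit that lets the constant column take a free coefficient, where the column effectively acts as an intercept, with a fit that pins that coefficient to zero, which is the outcome the notes above describe for Spark MLlib.

```python
def fit_free(x, c, y):
    """Least squares for y ~ w_x * x + w_c * c with no intercept.

    Solves the 2x2 normal equations in closed form. Illustrative only.
    """
    sxx = sum(a * a for a in x)
    scc = sum(b * b for b in c)
    sxc = sum(a * b for a, b in zip(x, c))
    sxy = sum(a * t for a, t in zip(x, y))
    scy = sum(b * t for b, t in zip(c, y))
    det = sxx * scc - sxc * sxc
    w_x = (sxy * scc - scy * sxc) / det
    w_c = (scy * sxx - sxy * sxc) / det
    return w_x, w_c


def fit_constant_pinned_to_zero(x, y):
    """Same model, but the constant column's coefficient is fixed at zero."""
    w_x = sum(a * t for a, t in zip(x, y)) / sum(a * a for a in x)
    return w_x, 0.0


x = [1.0, 2.0, 3.0]
c = [2.0, 2.0, 2.0]   # constant nonzero column
y = [3.0, 5.0, 7.0]   # generated as 2 * x + 0.5 * c

free = fit_free(x, c, y)
pinned = fit_constant_pinned_to_zero(x, y)
print("free coefficients:  ", free)
print("pinned coefficients:", pinned)
```

In the free fit the constant column absorbs the offset in `y`, so both coefficients recover the generating values; with the coefficient pinned to zero, the slope on `x` has to compensate and the two models disagree. This is why the same data can yield different models across tools when no intercept is fitted.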