author     Yanbo Liang <ybliang8@gmail.com>  2016-02-29 00:55:51 -0800
committer  DB Tsai <dbt@netflix.com>  2016-02-29 00:55:51 -0800
commit     d81a71357e24160244b6eeff028b0d9a4863becf (patch)
tree       0d5f6bdde7ce4edbe45883a908d80ab292845eb6 /mllib
parent     dd3b5455c61bddce96a94c2ce8f5d76ed4948ea1 (diff)
[SPARK-13545][MLLIB][PYSPARK] Make MLlib LogisticRegressionWithLBFGS's default parameters consistent in Scala and Python
## What changes were proposed in this pull request?
* The default value of ```regParam``` of PySpark MLlib ```LogisticRegressionWithLBFGS``` should be consistent with Scala, which is ```0.0```. (This is also consistent with ML ```LogisticRegression```.)
* Note that if we use a known updater (L1 or L2) for binary classification, ```LogisticRegressionWithLBFGS``` calls the ML implementation. The API doc should be updated to clarify that ```numCorrections``` has no effect if we fall into that route.
* Made a pass over all parameters of ```LogisticRegressionWithLBFGS```; the others are already set consistently.
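The first bullet can be illustrated with a plain-Python stand-in for the ```train``` classmethod. This is only a sketch of the signature, not the real method (which lives in ```pyspark.mllib.classification.LogisticRegressionWithLBFGS```); the parameter names follow that API, with ```regParam``` defaulting to ```0.0``` as this patch makes it:

```python
import inspect

# Illustrative stand-in for LogisticRegressionWithLBFGS.train in
# pyspark.mllib.classification; after this patch, regParam defaults
# to 0.0 to match the Scala side (and ML LogisticRegression).
def train(data, iterations=100, initialWeights=None, regParam=0.0,
          regType="l2", intercept=False, corrections=10, tolerance=1e-6,
          validateData=True, numClasses=2):
    raise NotImplementedError("sketch of the signature only")

# Collect the declared defaults so the Python side can be checked
# against the Scala side's `regParam = 0.0`:
defaults = {name: p.default
            for name, p in inspect.signature(train).parameters.items()
            if p.default is not inspect.Parameter.empty}
print(defaults["regParam"])  # 0.0
```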
cc mengxr dbtsai
## How was this patch tested?
No new tests; the change should pass all existing tests.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #11424 from yanboliang/spark-13545.
Diffstat (limited to 'mllib')
-rw-r--r-- | mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala | 4 |
1 file changed, 4 insertions, 0 deletions
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala b/mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
index c3882606d7..f807b5683c 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
@@ -408,6 +408,10 @@ class LogisticRegressionWithLBFGS
    * defaults to the mllib implementation. If more than two classes
    * or feature scaling is disabled, always uses mllib implementation.
    * Uses user provided weights.
+   *
+   * In the ml LogisticRegression implementation, the number of corrections
+   * used in the LBFGS update can not be configured. So `optimizer.setNumCorrections()`
+   * will have no effect if we fall into that route.
    */
   override def run(input: RDD[LabeledPoint], initialWeights: Vector): LogisticRegressionModel = {
     run(input, initialWeights, userSuppliedWeights = true)
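For context on what the new doc comment is saying: ```numCorrections``` is the history size `m` of L-BFGS, i.e. how many curvature pairs the optimizer keeps for its two-loop recursion. The toy sketch below is not Spark's actual optimizer (the Scala side delegates to an external L-BFGS implementation); it only illustrates the role of the knob that ```setNumCorrections()``` controls, and why fixing `m` in the ml route makes the mllib setter a no-op there:

```python
from collections import deque

def lbfgs_direction(grad, history):
    """Two-loop recursion: approximate inverse-Hessian times gradient
    from stored curvature pairs, where s = x_{k+1} - x_k and
    y = grad_{k+1} - grad_k. `history` holds at most m pairs.
    """
    q = list(grad)
    stack = []
    for s, y in reversed(history):          # newest pair first
        rho = 1.0 / sum(yi * si for yi, si in zip(y, s))
        alpha = rho * sum(si * qi for si, qi in zip(s, q))
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        stack.append((rho, alpha))
    r = q                                   # initial inverse Hessian taken as I
    for (rho, alpha), (s, y) in zip(reversed(stack), history):  # oldest first
        beta = rho * sum(yi * ri for yi, ri in zip(y, r))
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return r

num_corrections = 10                        # the value setNumCorrections would set
history = deque(maxlen=num_corrections)     # only the last m pairs are ever kept
```

With an empty history the recursion degenerates to plain gradient descent (the returned direction equals the gradient); as pairs accumulate, only the most recent ```num_corrections``` of them influence the search direction, which is exactly what cannot be tuned when ```LogisticRegressionWithLBFGS``` falls into the ml route.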