author    Yanbo Liang <ybliang8@gmail.com>  2016-02-29 00:55:51 -0800
committer DB Tsai <dbt@netflix.com>         2016-02-29 00:55:51 -0800
commitd81a71357e24160244b6eeff028b0d9a4863becf (patch)
tree0d5f6bdde7ce4edbe45883a908d80ab292845eb6 /mllib
parentdd3b5455c61bddce96a94c2ce8f5d76ed4948ea1 (diff)
[SPARK-13545][MLLIB][PYSPARK] Make MLlib LogisticRegressionWithLBFGS's default parameters consistent in Scala and Python
## What changes were proposed in this pull request?

* The default value of ```regParam``` of PySpark MLlib ```LogisticRegressionWithLBFGS``` should be consistent with Scala, which is ```0.0```. (This is also consistent with ML ```LogisticRegression```.)
* BTW, if we use a known updater (L1 or L2) for binary classification, ```LogisticRegressionWithLBFGS``` will call the ML implementation. We should update the API doc to clarify that ```numCorrections``` will have no effect if we fall into that route.
* Make a pass over all parameters of ```LogisticRegressionWithLBFGS```; the others are set properly.

cc mengxr dbtsai

## How was this patch tested?

No new tests; it should pass all current tests.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #11424 from yanboliang/spark-13545.
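The first bullet can be illustrated with a minimal sketch. This is not the real PySpark API, just a plain-Python stand-in for the wrapper's keyword defaults, showing ```regParam``` aligned with Scala's ```0.0``` (the PySpark default previously differed); the parameter names mirror the PySpark wrapper but the function itself is illustrative only.

```python
# Hedged sketch, not Spark source: a stand-in for the PySpark wrapper's
# keyword defaults after this change. The key point is regParam=0.0,
# matching the Scala-side default of LogisticRegressionWithLBFGS.
def train_defaults(iterations=100, regParam=0.0, regType="l2",
                   intercept=False, corrections=10, tolerance=1e-6):
    """Return the effective training parameters (illustrative only)."""
    return {
        "iterations": iterations,
        "regParam": regParam,      # 0.0, consistent with Scala
        "regType": regType,
        "intercept": intercept,
        "corrections": corrections,
        "tolerance": tolerance,
    }
```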
Diffstat (limited to 'mllib')
-rw-r--r--  mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala  4 ++++
1 file changed, 4 insertions(+), 0 deletions(-)
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala b/mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
index c3882606d7..f807b5683c 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
@@ -408,6 +408,10 @@ class LogisticRegressionWithLBFGS
* defaults to the mllib implementation. If more than two classes
* or feature scaling is disabled, always uses mllib implementation.
* Uses user provided weights.
+ *
+ * In the ml LogisticRegression implementation, the number of corrections
+ * used in the LBFGS update can not be configured. So `optimizer.setNumCorrections()`
+ * will have no effect if we fall into that route.
*/
override def run(input: RDD[LabeledPoint], initialWeights: Vector): LogisticRegressionModel = {
run(input, initialWeights, userSuppliedWeights = true)
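The routing that the added doc comment describes can be sketched as follows. This is an assumed simplification, not Spark source: with a known L1 or L2 updater, binary classification, and feature scaling enabled, the ml implementation is chosen, and on that route any `numCorrections` set via `optimizer.setNumCorrections()` is ignored.

```python
# Hedged sketch (assumed names, not Spark source) of the dispatch described
# in the doc comment: the ml route fixes the number of LBFGS corrections
# internally, so the optimizer's numCorrections only matters on the mllib route.
def choose_route(num_classes, updater, feature_scaling, num_corrections):
    if num_classes == 2 and updater in ("l1", "l2") and feature_scaling:
        return ("ml", None)  # ml route: numCorrections is not configurable
    return ("mllib", num_corrections)  # mllib route: the value is honored
```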