author Yanbo Liang <ybliang8@gmail.com> 2016-02-22 23:37:09 -0800
committer Xiangrui Meng <meng@databricks.com> 2016-02-22 23:37:09 -0800
commit 72427c3e115daf06f7ad8aa50115a8e0da2c6d62
tree 4f193b6e3d4ffcd30b08149aa2faed5fe08bf1ac /python/pyspark/mllib
parent 4fd1993692d45a0da0289b8c7669cc1dc3fe0f2b
[SPARK-13429][MLLIB] Unify Logistic Regression convergence tolerance of ML & MLlib
## What changes were proposed in this pull request?

To provide better and more consistent results, change the default value of MLlib `LogisticRegressionWithLBFGS convergenceTol` from `1E-4` to `1E-6`, which matches the default of ML `LogisticRegression`.

cc dbtsai

## How was this patch tested?

Unit tests.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #11299 from yanboliang/spark-13429.
Diffstat (limited to 'python/pyspark/mllib')
-rw-r--r-- python/pyspark/mllib/classification.py 4
1 files changed, 2 insertions, 2 deletions
diff --git a/python/pyspark/mllib/classification.py b/python/pyspark/mllib/classification.py
index b24592c379..b4d54ef61b 100644
--- a/python/pyspark/mllib/classification.py
+++ b/python/pyspark/mllib/classification.py
@@ -327,7 +327,7 @@ class LogisticRegressionWithLBFGS(object):
     @classmethod
     @since('1.2.0')
     def train(cls, data, iterations=100, initialWeights=None, regParam=0.01, regType="l2",
-              intercept=False, corrections=10, tolerance=1e-4, validateData=True, numClasses=2):
+              intercept=False, corrections=10, tolerance=1e-6, validateData=True, numClasses=2):
         """
         Train a logistic regression model on the given data.
@@ -359,7 +359,7 @@ class LogisticRegressionWithLBFGS(object):
           (default: 10)
         :param tolerance:
           The convergence tolerance of iterations for L-BFGS.
-          (default: 1e-4)
+          (default: 1e-6)
         :param validateData:
           Boolean parameter which indicates if the algorithm should
           validate data before training.
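
To illustrate what a tighter convergence tolerance means in practice, here is a minimal pure-Python sketch. It is not the L-BFGS optimizer that MLlib uses: it swaps in plain gradient descent on a one-feature logistic regression and stops when the relative change in the loss falls below `tolerance`. The stopping rule, the helper names (`sigmoid`, `loss_and_grad`, `fit`), the step size, and the toy data are all assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def loss_and_grad(w, b, xs, ys, reg):
    """Mean log-loss with an L2 penalty on w, and its gradient in (w, b)."""
    n = len(xs)
    loss, gw, gb = 0.5 * reg * w * w, reg * w, 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        loss -= (y * math.log(p) + (1 - y) * math.log(1.0 - p)) / n
        gw += (p - y) * x / n
        gb += (p - y) / n
    return loss, gw, gb


def fit(xs, ys, tolerance, max_iter=10000, step=0.5, reg=0.01):
    """Gradient descent that stops once the relative loss change <= tolerance.

    Returns (w, b, final_loss, iterations_used).
    """
    w, b = 0.0, 0.0
    loss, gw, gb = loss_and_grad(w, b, xs, ys, reg)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        w, b = w - step * gw, b - step * gb
        new_loss, gw, gb = loss_and_grad(w, b, xs, ys, reg)
        # Assumed stopping rule: relative improvement of the objective.
        converged = abs(loss - new_loss) <= tolerance * max(abs(new_loss), 1e-12)
        loss = new_loss
        if converged:
            break
    return w, b, loss, iteration


# Hypothetical toy data: one feature, binary labels, not linearly separable.
XS = [-2.0, -1.0, -0.5, 0.5, 1.5, 2.5]
YS = [0, 0, 1, 0, 1, 1]

for tol in (1e-4, 1e-6):  # old and new defaults from this patch
    w, b, loss, iters = fit(XS, YS, tolerance=tol)
    print(f"tolerance={tol:g}: iterations={iters}, loss={loss:.8f}, w={w:.5f}, b={b:.5f}")
```

In this sketch the tighter tolerance never stops earlier than the looser one, since both runs follow the same trajectory and differ only in when they stop. Callers who depend on the previous behavior of `LogisticRegressionWithLBFGS.train` can still pass `tolerance=1e-4` explicitly, since the patch changes only the default.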