author | Naftali Harris <naftaliharris@gmail.com> | 2014-07-30 09:56:59 -0700 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2014-07-30 09:56:59 -0700 |
commit | e3d85b7e40073b05e2588583e9d8db11366c2f7b (patch) | |
tree | 8691dfd4ee050bbc60ffa3489c9b1b188bb1807a /python | |
parent | 3bc3f1801e3347e02cbecdd8e941003430155da2 (diff) | |
Avoid numerical instability
This avoids basically doing 1 - 1, for example:
```python
>>> from math import exp
>>> margin = -40
>>> 1 - 1 / (1 + exp(margin))
0.0
>>> exp(margin) / (1 + exp(margin))
4.248354255291589e-18
>>>
```
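The cancellation in the example above can be checked against machine epsilon. The following is a minimal sketch, assuming IEEE 754 double-precision floats (the usual CPython `float`); the names `prob_old` and `prob_new` are illustrative only and do not appear in Spark.

```python
import sys
from math import exp

def prob_old(margin):
    # Pre-patch form for margin <= 0: subtracts two nearly equal numbers.
    return 1 - 1 / (1 + exp(margin))

def prob_new(margin):
    # Patched form for margin <= 0: no subtraction, so tiny values survive.
    exp_margin = exp(margin)
    return exp_margin / (1 + exp_margin)

margin = -40
# exp(-40) is below machine epsilon, so adding it to 1 changes nothing.
print(exp(margin) < sys.float_info.epsilon)
print(1 + exp(margin) == 1.0)
# The old form collapses to 1 - 1; the new form keeps the small probability.
print(prob_old(margin), prob_new(margin))
```

Since the model only compares `prob` against 0.5, the predicted class for this margin is the same either way; the patched form matters when the probability itself is inspected.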
Author: Naftali Harris <naftaliharris@gmail.com>
Closes #1652 from naftaliharris/patch-2 and squashes the following commits:
0d55a9f [Naftali Harris] Avoid numerical instability
Diffstat (limited to 'python')
-rw-r--r-- | python/pyspark/mllib/classification.py | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/python/pyspark/mllib/classification.py b/python/pyspark/mllib/classification.py
index 9e28dfbb91..2bbb9c3fca 100644
--- a/python/pyspark/mllib/classification.py
+++ b/python/pyspark/mllib/classification.py
@@ -66,7 +66,8 @@ class LogisticRegressionModel(LinearModel):
         if margin > 0:
             prob = 1 / (1 + exp(-margin))
         else:
-            prob = 1 - 1 / (1 + exp(margin))
+            exp_margin = exp(margin)
+            prob = exp_margin / (1 + exp_margin)
         return 1 if prob > 0.5 else 0
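
The `if margin > 0` branch that the hunk keeps as context appears to serve a separate purpose: it keeps the argument passed to `exp` non-positive, so the call cannot overflow. The sketch below illustrates that reading; `sigmoid_naive` and `sigmoid_stable` are hypothetical names for illustration, not Spark functions.

```python
from math import exp

def sigmoid_naive(margin):
    # One formula for every input; exp(-margin) overflows for very negative margins.
    return 1 / (1 + exp(-margin))

def sigmoid_stable(margin):
    # Two-branch form mirroring the patched code: exp() only sees values <= 0.
    if margin > 0:
        return 1 / (1 + exp(-margin))
    exp_margin = exp(margin)
    return exp_margin / (1 + exp_margin)

try:
    sigmoid_naive(-1000)
except OverflowError as err:
    print("naive form failed:", err)

# exp(-1000) underflows to 0.0 instead of raising, so both extremes are safe.
print(sigmoid_stable(-1000))
print(sigmoid_stable(1000))
```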