author Yanbo Liang <ybliang8@gmail.com> 2016-07-27 11:24:28 +0100
committer Sean Owen <sowen@cloudera.com> 2016-07-27 11:24:28 +0100
commit 3c3371bbd6361011b138cce88f6396a2aa4e2cb9
tree b69e55046df02fd004830517f0a1b388c84ad823 /mllib
parent ef0ccbcb07252db0ead8509e70d1a9a670d41616
[MINOR][ML] Fix some mistake in LinearRegression formula.
## What changes were proposed in this pull request?
Fix some mistake in ```LinearRegression``` formula.

## How was this patch tested?
Documents change, no tests.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #14369 from yanboliang/LiR-formula.
Diffstat (limited to 'mllib')
-rw-r--r-- mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala | 6
1 files changed, 3 insertions, 3 deletions
diff --git a/mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala b/mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala
index a0ff7f07aa..f3dc65e0df 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala
@@ -800,16 +800,16 @@ class LinearRegressionSummary private[regression] (
* {{{
* \frac{\partial L}{\partial w_i} =
* 1/N \sum_j diff_j (x_{ij} - \bar{x_i}) / \hat{x_i}
- * = 1/N ((\sum_j diff_j x_{ij} / \hat{x_i}) - diffSum \bar{x_i}) / \hat{x_i})
+ * = 1/N ((\sum_j diff_j x_{ij} / \hat{x_i}) - diffSum \bar{x_i} / \hat{x_i})
* = 1/N ((\sum_j diff_j x_{ij} / \hat{x_i}) + correction_i)
* }}},
- * where correction_i = - diffSum \bar{x_i}) / \hat{x_i}
+ * where correction_i = - diffSum \bar{x_i} / \hat{x_i}
*
* A simple math can show that diffSum is actually zero, so we don't even
* need to add the correction terms in the end. From the definition of diff,
* {{{
* diffSum = \sum_j (\sum_i w_i(x_{ij} - \bar{x_i}) / \hat{x_i} - (y_j - \bar{y}) / \hat{y})
- * = N * (\sum_i w_i(\bar{x_i} - \bar{x_i}) / \hat{x_i} - (\bar{y_j} - \bar{y}) / \hat{y})
+ * = N * (\sum_i w_i(\bar{x_i} - \bar{x_i}) / \hat{x_i} - (\bar{y} - \bar{y}) / \hat{y})
* = 0
* }}}
*
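
The corrected formulas in this hunk can be sanity-checked numerically. The following is a minimal sketch, not code from the patch: it assumes \bar{} denotes a mean and \hat{} a standard deviation (the hunk itself does not define them), and the object name, helper functions, and sample data are all hypothetical. It computes diffSum and correction_i as written in the comment and compares the gradient with and without the centering term.

```scala
// Hypothetical stand-alone check of the identities in the Scaladoc comment.
// Assumption: \bar{} is the mean and \hat{} is the standard deviation.
object DiffSumCheck {
  def mean(v: Seq[Double]): Double = v.sum / v.size

  // Sample standard deviation; any positive scale would do for this check.
  def std(v: Seq[Double]): Double = {
    val m = mean(v)
    math.sqrt(v.map(a => (a - m) * (a - m)).sum / (v.size - 1))
  }

  // Made-up data: x(i)(j) is feature i of sample j, y(j) is the label of sample j.
  val x: Seq[Seq[Double]] = Seq(
    Seq(1.0, 2.0, 4.0, 7.0),
    Seq(0.5, -1.0, 3.0, 2.5))
  val y: Seq[Double] = Seq(3.0, 1.0, 8.0, 10.0)
  val w: Seq[Double] = Seq(0.7, -1.3) // arbitrary weights

  val n: Int = y.size
  val xBar: Seq[Double] = x.map(mean)
  val xHat: Seq[Double] = x.map(std)
  val yBar: Double = mean(y)
  val yHat: Double = std(y)

  // diff_j = \sum_i w_i (x_{ij} - \bar{x_i}) / \hat{x_i} - (y_j - \bar{y}) / \hat{y}
  val diff: Seq[Double] = (0 until n).map { j =>
    val pred = w.indices.map(i => w(i) * (x(i)(j) - xBar(i)) / xHat(i)).sum
    pred - (y(j) - yBar) / yHat
  }
  val diffSum: Double = diff.sum

  // correction_i = - diffSum \bar{x_i} / \hat{x_i}
  val correction: Seq[Double] = w.indices.map(i => -diffSum * xBar(i) / xHat(i))

  // 1/N \sum_j diff_j (x_{ij} - \bar{x_i}) / \hat{x_i}
  val gradCentered: Seq[Double] = w.indices.map { i =>
    (0 until n).map(j => diff(j) * (x(i)(j) - xBar(i)) / xHat(i)).sum / n
  }

  // 1/N \sum_j diff_j x_{ij} / \hat{x_i}, i.e. with the correction term dropped
  val gradUncentered: Seq[Double] = w.indices.map { i =>
    (0 until n).map(j => diff(j) * x(i)(j) / xHat(i)).sum / n
  }

  def main(args: Array[String]): Unit = {
    println(s"diffSum        = $diffSum")
    println(s"correction     = $correction")
    println(s"gradCentered   = $gradCentered")
    println(s"gradUncentered = $gradUncentered")
  }
}
```

Under these assumptions diffSum and every correction_i should come out at floating-point noise level, and the two gradient vectors should agree to within rounding. This matches the comment's claim that the correction terms can be skipped.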