[SPARK-15617][ML][DOC] Clarify that fMeasure in MulticlassMetrics is "micro" f1_score

## What changes were proposed in this pull request?
1. Delete `precision` and `recall` in `ml.MulticlassClassificationEvaluator`
2. Update user guide for `mllib.weightedFMeasure`

## How was this patch tested?
local build

Author: Ruifeng Zheng <ruifengz@foxmail.com>

Closes #13390 from zhengruifeng/clarify_f1.
author: Ruifeng Zheng <ruifengz@foxmail.com> 2016-06-04 13:56:04 +0100
committer: Sean Owen <sowen@cloudera.com> 2016-06-04 13:56:04 +0100
commit: 2099e05f93067937cdf6cedcf493afd66e212abe (patch)
tree: df00189031ecedfea74cd07e60c6542e4cc894dc /docs/mllib-evaluation-metrics.md
parent: 2ca563cc45d1ac1c19b8e84c5a87a950c712ab87 (diff)
1 files changed, 3 insertions, 13 deletions
diff --git a/docs/mllib-evaluation-metrics.md b/docs/mllib-evaluation-metrics.md
index a269dbf030..c49bc4ff12 100644
--- a/docs/mllib-evaluation-metrics.md
+++ b/docs/mllib-evaluation-metrics.md
@@ -140,7 +140,7 @@ definitions of positive and negative labels is straightforward.
 #### Label based metrics
 
 Opposed to binary classification where there are only two possible labels, multiclass classification problems have many
-possible labels and so the concept of label-based metrics is introduced. Overall precision measures precision across all
+possible labels and so the concept of label-based metrics is introduced. Accuracy measures precision across all
 labels -  the number of times any class was predicted correctly (true positives) normalized by the number of data
 points. Precision by label considers only one class, and measures the number of time a specific label was predicted
 correctly normalized by the number of times that label appears in the output.
@@ -182,21 +182,11 @@ $$\hat{\delta}(x) = \begin{cases}1 & \text{if $x = 0$}, \\ 0 & \text{otherwise}.
       </td>
     </tr>
     <tr>
-      <td>Overall Precision</td>
-      <td>$PPV = \frac{TP}{TP + FP} = \frac{1}{N}\sum_{i=0}^{N-1} \hat{\delta}\left(\hat{\mathbf{y}}_i -
-        \mathbf{y}_i\right)$</td>
-    </tr>
-    <tr>
-      <td>Overall Recall</td>
-      <td>$TPR = \frac{TP}{TP + FN} = \frac{1}{N}\sum_{i=0}^{N-1} \hat{\delta}\left(\hat{\mathbf{y}}_i -
+      <td>Accuracy</td>
+      <td>$ACC = \frac{TP}{TP + FP} = \frac{1}{N}\sum_{i=0}^{N-1} \hat{\delta}\left(\hat{\mathbf{y}}_i -
         \mathbf{y}_i\right)$</td>
     </tr>
     <tr>
-      <td>Overall F1-measure</td>
-      <td>$F1 = 2 \cdot \left(\frac{PPV \cdot TPR}
-          {PPV + TPR}\right)$</td>
-    </tr>
-    <tr>
       <td>Precision by label</td>
       <td>$PPV(\ell) = \frac{TP}{TP + FP} =
           \frac{\sum_{i=0}^{N-1} \hat{\delta}(\hat{\mathbf{y}}_i - \ell) \cdot \hat{\delta}(\mathbf{y}_i - \ell)}
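
The patch replaces the "overall" precision, recall and F1 rows with a single accuracy row. For single-label multiclass predictions, every misclassified point counts as one false positive (for the predicted label) and one false negative (for the true label), so the micro-averaged precision, recall and F1 all reduce to accuracy. The following is a minimal sketch in plain Python that checks this numerically; it does not use the Spark API, and the function name `micro_metrics` is hypothetical, introduced only for illustration.

```python
import math
from collections import Counter


def micro_metrics(y_true, y_pred):
    """Return (precision, recall, f1, accuracy), micro-averaged over all labels.

    Hypothetical helper for illustration only; not part of Spark.
    Assumes exactly one true label and one predicted label per data point.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction: true positive for that label
        else:
            fp[pred] += 1    # false positive for the predicted label
            fn[truth] += 1   # false negative for the true label

    # Micro-averaging pools the counts over all labels before dividing.
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    accuracy = total_tp / len(y_true)
    return precision, recall, f1, accuracy


if __name__ == "__main__":
    y_true = [0, 1, 2, 2, 1, 0, 2]
    y_pred = [0, 2, 2, 1, 1, 0, 0]
    p, r, f1, acc = micro_metrics(y_true, y_pred)
    # All four values coincide for single-label multiclass data.
    print(p, r, f1, acc)
```

Because the pooled false positive and false negative counts are always equal in this setting, the denominators `TP + FP` and `TP + FN` both equal the number of data points, which is why a single accuracy row carries the same information as the three rows it replaces.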