path: root/python/pyspark/ml/evaluation.py
author Feynman Liang <fliang@databricks.com> 2015-08-19 11:35:05 -0700
committer Joseph K. Bradley <joseph@databricks.com> 2015-08-19 11:35:05 -0700
commit 28a98464ea65aa7b35e24fca5ddaa60c2e5d53ee
tree 35c507135016a31b2157b15cc1de9aa67212dfe3 /python/pyspark/ml/evaluation.py
parent 5fd53c64bb01de74ae57a7068de85b34adc856cf
[SPARK-10097] Adds `shouldMaximize` flag to `ml.evaluation.Evaluator`
Previously, users of an evaluator (`CrossValidator` and `TrainValidationSplit`) could only maximize the evaluator's metric. This led to a hacky workaround that negated metrics meant to be minimized, which caused erroneous negative values to be reported to the user. This PR adds an `isLargerBetter` attribute to the `Evaluator` base class, telling users of `Evaluator` whether the chosen metric should be maximized or minimized.

CC jkbradley

Author: Feynman Liang <fliang@databricks.com>
Author: Joseph K. Bradley <joseph@databricks.com>

Closes #8290 from feynmanliang/SPARK-10097.
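
To illustrate the idea in the commit message, here is a minimal sketch in plain Python of how a consumer of an evaluator might pick the best candidate once the evaluator reports its preferred direction. It is not the Spark implementation: `SketchEvaluator`, `is_larger_better`, and `select_best` are hypothetical names invented for this example, and the scores are made-up numbers.

```python
from typing import Sequence


class SketchEvaluator:
    """Hypothetical stand-in for an evaluator (not the Spark class)."""

    def __init__(self, metric_name: str, larger_is_better: bool):
        self.metric_name = metric_name
        self._larger_is_better = larger_is_better

    def is_larger_better(self) -> bool:
        # Tells the caller whether to maximize (True) or minimize (False).
        return self._larger_is_better


def select_best(metrics: Sequence[float], evaluator: SketchEvaluator) -> int:
    """Return the index of the best metric, honoring the evaluator's direction."""
    if not metrics:
        raise ValueError("metrics must not be empty")
    # No sign flipping: the raw metric values stay as they are, and only
    # the choice between max and min depends on the evaluator.
    pick = max if evaluator.is_larger_better() else min
    return pick(range(len(metrics)), key=lambda i: metrics[i])


# Illustrative (made-up) scores for three candidate models.
rmse_eval = SketchEvaluator("rmse", larger_is_better=False)
r2_eval = SketchEvaluator("r2", larger_is_better=True)

best_by_rmse = select_best([3.0, 1.5, 2.0], rmse_eval)  # lowest error wins
best_by_r2 = select_best([0.2, 0.9, 0.5], r2_eval)      # highest score wins
```

With this arrangement, an error metric such as RMSE can be reported as a positive number, and the caller never needs to negate it in order to "maximize" it.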
Diffstat (limited to 'python/pyspark/ml/evaluation.py')
-rw-r--r-- python/pyspark/ml/evaluation.py 4
1 files changed, 2 insertions, 2 deletions
diff --git a/python/pyspark/ml/evaluation.py b/python/pyspark/ml/evaluation.py
index e23ce053ba..6b0a9ffde9 100644
--- a/python/pyspark/ml/evaluation.py
+++ b/python/pyspark/ml/evaluation.py
@@ -163,11 +163,11 @@ class RegressionEvaluator(JavaEvaluator, HasLabelCol, HasPredictionCol):
...
>>> evaluator = RegressionEvaluator(predictionCol="raw")
>>> evaluator.evaluate(dataset)
- -2.842...
+ 2.842...
>>> evaluator.evaluate(dataset, {evaluator.metricName: "r2"})
0.993...
>>> evaluator.evaluate(dataset, {evaluator.metricName: "mae"})
- -2.649...
+ 2.649...
"""
# Because we will maximize evaluation value (ref: `CrossValidator`),
# when we evaluate a metric that is needed to minimize (e.g., `"rmse"`, `"mse"`, `"mae"`),