author | Feynman Liang <fliang@databricks.com> | 2015-08-19 11:35:05 -0700 |
---|---|---|
committer | Joseph K. Bradley <joseph@databricks.com> | 2015-08-19 11:35:05 -0700 |
commit | 28a98464ea65aa7b35e24fca5ddaa60c2e5d53ee (patch) | |
tree | 35c507135016a31b2157b15cc1de9aa67212dfe3 /python | |
parent | 5fd53c64bb01de74ae57a7068de85b34adc856cf (diff) | |
[SPARK-10097] Adds `shouldMaximize` flag to `ml.evaluation.Evaluator`
Previously, consumers of an `Evaluator` (`CrossValidator` and `TrainValidationSplit`) could only maximize the evaluator's metric. The workaround was a hack: metrics that should be minimized were negated, which caused erroneous negative values to be reported to the user.
This PR adds an `isLargerBetter` attribute to the `Evaluator` base class, which tells consumers of an `Evaluator` whether the chosen metric should be maximized or minimized.
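As a rough illustration of the idea, the sketch below shows how a consumer might pick the best candidate using such a flag instead of negating the metric. It is plain Python and does not use Spark: `SketchEvaluator`, `mae`, and `select_best` are hypothetical names invented for this example, and only the name `isLargerBetter` is taken from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SketchEvaluator:
    """Hypothetical stand-in for an evaluator; this is not the Spark class."""
    metric_name: str
    metric_fn: Callable[[Sequence[float], Sequence[float]], float]
    larger_is_better: bool

    def evaluate(self, labels, predictions):
        # Report the metric as-is, with no sign flip.
        return self.metric_fn(labels, predictions)

    def isLargerBetter(self):
        # Tells the caller whether to maximize or minimize the metric.
        return self.larger_is_better


def mae(labels, predictions):
    """Mean absolute error; smaller values are better."""
    return sum(abs(l - p) for l, p in zip(labels, predictions)) / len(labels)


def select_best(evaluator, labels, candidates):
    """Return (index, metric) of the best candidate prediction list."""
    metrics = [evaluator.evaluate(labels, preds) for preds in candidates]
    # Choose max or min depending on the evaluator's flag.
    pick = max if evaluator.isLargerBetter() else min
    best_index = pick(range(len(metrics)), key=metrics.__getitem__)
    return best_index, metrics[best_index]


labels = [1.0, 2.0, 3.0]
candidates = [[1.5, 2.5, 3.5], [1.0, 2.0, 4.0]]  # two made-up models' predictions
evaluator = SketchEvaluator("mae", mae, larger_is_better=False)
best_index, best_metric = select_best(evaluator, labels, candidates)
```

Because the selection logic consults the flag, the reported metric keeps its natural sign: the second candidate wins here with the smaller, positive mean absolute error.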
CC jkbradley
Author: Feynman Liang <fliang@databricks.com>
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #8290 from feynmanliang/SPARK-10097.
Diffstat (limited to 'python')
-rw-r--r-- | python/pyspark/ml/evaluation.py | 4 |
1 files changed, 2 insertions, 2 deletions
    diff --git a/python/pyspark/ml/evaluation.py b/python/pyspark/ml/evaluation.py
    index e23ce053ba..6b0a9ffde9 100644
    --- a/python/pyspark/ml/evaluation.py
    +++ b/python/pyspark/ml/evaluation.py
    @@ -163,11 +163,11 @@ class RegressionEvaluator(JavaEvaluator, HasLabelCol, HasPredictionCol):
         ...
         >>> evaluator = RegressionEvaluator(predictionCol="raw")
         >>> evaluator.evaluate(dataset)
    -    -2.842...
    +    2.842...
         >>> evaluator.evaluate(dataset, {evaluator.metricName: "r2"})
         0.993...
         >>> evaluator.evaluate(dataset, {evaluator.metricName: "mae"})
    -    -2.649...
    +    2.649...
         """
         # Because we will maximize evaluation value (ref: `CrossValidator`),
         # when we evaluate a metric that is needed to minimize (e.g., `"rmse"`, `"mse"`, `"mae"`),