From 0b713e0455d01999d5a027ddc2ea8527eb085b34 Mon Sep 17 00:00:00 2001
From: Yuhao Yang
Date: Fri, 11 Mar 2016 09:31:35 +0200
Subject: [SPARK-13512][ML] add example and doc for MaxAbsScaler

## What changes were proposed in this pull request?

jira: https://issues.apache.org/jira/browse/SPARK-13512
Add example and doc for ml.feature.MaxAbsScaler.

## How was this patch tested?

unit tests

Author: Yuhao Yang

Closes #11392 from hhbyyh/maxabsdoc.
---
 docs/ml-features.md | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/docs/ml-features.md b/docs/ml-features.md
index 68d3ea2971..4fe8eefc26 100644
--- a/docs/ml-features.md
+++ b/docs/ml-features.md
@@ -773,6 +773,38 @@ for more details on the API.
+
+## MaxAbsScaler
+
+`MaxAbsScaler` transforms a dataset of `Vector` rows, rescaling each feature to range [-1, 1]
+by dividing through the maximum absolute value in each feature. It does not shift/center the
+data, and thus does not destroy any sparsity.
+
+`MaxAbsScaler` computes summary statistics on a data set and produces a `MaxAbsScalerModel`. The
+model can then transform each feature individually to range [-1, 1].
+
+The following example demonstrates how to load a dataset in libsvm format and then rescale each feature to [-1, 1].
+
+
+
+
+Refer to the [MaxAbsScaler Scala docs](api/scala/index.html#org.apache.spark.ml.feature.MaxAbsScaler)
+and the [MaxAbsScalerModel Scala docs](api/scala/index.html#org.apache.spark.ml.feature.MaxAbsScalerModel)
+for more details on the API.
+
+{% include_example scala/org/apache/spark/examples/ml/MaxAbsScalerExample.scala %}
+
+ +
+
+Refer to the [MaxAbsScaler Java docs](api/java/org/apache/spark/ml/feature/MaxAbsScaler.html)
+and the [MaxAbsScalerModel Java docs](api/java/org/apache/spark/ml/feature/MaxAbsScalerModel.html)
+for more details on the API.
+
+{% include_example java/org/apache/spark/examples/ml/JavaMaxAbsScalerExample.java %}
+
+
+
 ## Bucketizer
 
 `Bucketizer` transforms a column of continuous features to a column of feature buckets, where the buckets are specified by users. It takes a parameter:
--
cgit v1.2.3
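
Note on the documented behaviour: the rescaling described in the added `MaxAbsScaler` section (divide each feature by its maximum absolute value, with no shifting or centering) can be illustrated with a small plain-Java sketch. This is not Spark's implementation and does not use the Spark API. The class name `MaxAbsScalingSketch`, its method names, and the sample data are hypothetical, and leaving an all-zero feature unchanged is an assumption of this sketch.

```java
public class MaxAbsScalingSketch {

    // Returns the maximum absolute value of each feature (column) across all rows.
    static double[] maxAbsPerFeature(double[][] rows) {
        if (rows.length == 0) {
            return new double[0];
        }
        double[] maxAbs = new double[rows[0].length];
        for (double[] row : rows) {
            for (int j = 0; j < row.length; j++) {
                maxAbs[j] = Math.max(maxAbs[j], Math.abs(row[j]));
            }
        }
        return maxAbs;
    }

    // Divides every value by its feature's maximum absolute value, so each
    // feature ends up in [-1, 1]. Nothing is subtracted, so zeros stay zero
    // and sparse data stays sparse.
    static double[][] rescale(double[][] rows) {
        double[] maxAbs = maxAbsPerFeature(rows);
        double[][] scaled = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            scaled[i] = new double[rows[i].length];
            for (int j = 0; j < rows[i].length; j++) {
                // Assumption of this sketch: an all-zero feature is left as-is
                // rather than divided by zero.
                double divisor = (maxAbs[j] == 0.0) ? 1.0 : maxAbs[j];
                scaled[i][j] = rows[i][j] / divisor;
            }
        }
        return scaled;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: three rows, three features.
        double[][] rows = {
            {1.0, -8.0, 0.0},
            {2.0, 4.0, 0.0},
            {-4.0, 2.0, 0.0}
        };
        for (double[] row : rescale(rows)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

In this sketch the per-feature maxima play the role of the summary statistics that the section says are computed when fitting, and `rescale` plays the role of the model's transform step.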