path: root/python/pyspark/ml/regression.py
author: Yuhao Yang <hhbyyh@gmail.com> 2016-02-25 21:04:35 -0800
committer: Xiangrui Meng <meng@databricks.com> 2016-02-25 21:04:35 -0800
commit: 90d07154c2cef3d1095cb3caeafa7003218a3e49
tree: fddca441bd27ab2ce4aea8fdbb32c5f6d9cb6dfc /python/pyspark/ml/regression.py
parent: 1b39fafa75a162f183824ff2daa61d73b05ebc83
[SPARK-13028] [ML] Add MaxAbsScaler to ML.feature as a transformer
jira: https://issues.apache.org/jira/browse/SPARK-13028

MaxAbsScaler works much like MinMaxScaler, but scales each feature so that the training data lies within the range [-1, 1] by dividing by the maximum absolute value of that feature. The motivation for this scaling includes robustness to very small standard deviations of features and preservation of zero entries in sparse data. Unlike StandardScaler and MinMaxScaler, MaxAbsScaler does not shift or center the data, and thus does not destroy any sparsity.

Something similar from sklearn: http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html#sklearn.preprocessing.MaxAbsScaler

Author: Yuhao Yang <hhbyyh@gmail.com>

Closes #10939 from hhbyyh/maxabs and squashes the following commits:

fd8bdcd [Yuhao Yang] add tag and some optimization on fit
648fced [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into maxabs
75bebc2 [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into maxabs
cb10bb6 [Yuhao Yang] remove minmax
91ef8f3 [Yuhao Yang] ut added
8ab0747 [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into maxabs
a9215b5 [Yuhao Yang] max abs scaler
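
The scaling described in the commit message can be illustrated with a minimal pure-Python sketch. This is not the code added by this commit and does not use the Spark API; the function names `max_abs_fit` and `max_abs_transform` and the sample data are hypothetical, chosen only to show the arithmetic. The handling of an all-zero feature (dividing by 1.0) is an assumption made here so the sketch avoids division by zero.

```python
from typing import List


def max_abs_fit(rows: List[List[float]]) -> List[float]:
    """Return the largest absolute value observed in each feature (column)."""
    n_features = len(rows[0])
    max_abs = [0.0] * n_features
    for row in rows:
        for j, value in enumerate(row):
            max_abs[j] = max(max_abs[j], abs(value))
    return max_abs


def max_abs_transform(rows: List[List[float]], max_abs: List[float]) -> List[List[float]]:
    """Divide each feature by its maximum absolute value; no shifting or centering."""
    # Assumption for this sketch: an all-zero feature is divided by 1.0,
    # so its entries stay zero instead of raising ZeroDivisionError.
    scale = [m if m != 0.0 else 1.0 for m in max_abs]
    return [[value / s for value, s in zip(row, scale)] for row in rows]


# Hypothetical training data: three rows, three features (the last is all zeros).
data = [
    [1.0, -8.0, 0.0],
    [2.0, 4.0, 0.0],
    [-4.0, 0.0, 0.0],
]

max_abs = max_abs_fit(data)
scaled = max_abs_transform(data, max_abs)
```

Because each value is only divided by a positive constant, zero entries remain zero and signs are preserved, which is why this kind of scaling keeps sparse data sparse.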
Diffstat (limited to 'python/pyspark/ml/regression.py')
0 files changed, 0 insertions, 0 deletions