[SPARK-13028] [ML] Add MaxAbsScaler to ML.feature as a transformer - spark


author	Yuhao Yang <hhbyyh@gmail.com>	2016-02-25 21:04:35 -0800
committer	Xiangrui Meng <meng@databricks.com>	2016-02-25 21:04:35 -0800
commit	90d07154c2cef3d1095cb3caeafa7003218a3e49 (patch)
tree	fddca441bd27ab2ce4aea8fdbb32c5f6d9cb6dfc /python/pyspark/ml/regression.py
parent	1b39fafa75a162f183824ff2daa61d73b05ebc83 (diff)

[SPARK-13028] [ML] Add MaxAbsScaler to ML.feature as a transformer

jira: https://issues.apache.org/jira/browse/SPARK-13028

MaxAbsScaler works much like MinMaxScaler, but scales each feature so that the training data lies within the range [-1, 1], by dividing by the maximum absolute value of that feature. The motivation for this scaling includes robustness to very small standard deviations of features and preservation of zero entries in sparse data.

Unlike StandardScaler and MinMaxScaler, MaxAbsScaler does not shift/center the data, and thus does not destroy any sparsity.

Something similar from sklearn: http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html#sklearn.preprocessing.MaxAbsScaler

Author: Yuhao Yang <hhbyyh@gmail.com>

Closes #10939 from hhbyyh/maxabs and squashes the following commits:

fd8bdcd [Yuhao Yang] add tag and some optimization on fit
648fced [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into maxabs
75bebc2 [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into maxabs
cb10bb6 [Yuhao Yang] remove minmax
91ef8f3 [Yuhao Yang] ut added
8ab0747 [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into maxabs
a9215b5 [Yuhao Yang] max abs scaler
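
The scaling described above can be illustrated with a minimal plain-Python sketch. This is not the code added by this commit and does not use the Spark API; the function names `fit_max_abs` and `transform_max_abs` are hypothetical, and treating an all-zero feature as having a divisor of 1.0 is an assumption made here to avoid division by zero.

```python
def fit_max_abs(rows):
    """Return the per-feature maximum absolute value over all rows."""
    num_features = len(rows[0])
    max_abs = [0.0] * num_features
    for row in rows:
        for j, value in enumerate(row):
            max_abs[j] = max(max_abs[j], abs(value))
    return max_abs


def transform_max_abs(rows, max_abs):
    """Divide each feature by its maximum absolute value (no shifting)."""
    # Assumption: a feature that is zero everywhere is divided by 1.0,
    # so it stays zero instead of producing 0/0.
    divisors = [m if m != 0.0 else 1.0 for m in max_abs]
    return [[value / d for value, d in zip(row, divisors)] for row in rows]


# Hypothetical toy data: three rows, three features.
data = [
    [1.0, 0.0, -8.0],
    [2.0, 0.0, 4.0],
    [-4.0, 0.0, 0.0],
]

max_abs = fit_max_abs(data)
scaled = transform_max_abs(data, max_abs)
```

Because the data is only divided and never shifted, every zero in `data` is still zero in `scaled`, which is why the approach suits sparse vectors.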

Diffstat (limited to 'python/pyspark/ml/regression.py')

0 files changed, 0 insertions, 0 deletions

