[SPARK-13132][MLLIB] cache standardization param value in LogisticRegression - spark

author	Gary King <gary@idibon.com>	2016-02-07 09:13:28 +0000
committer	Sean Owen <sowen@cloudera.com>	2016-02-07 09:13:28 +0000
commit	bc8890b357811612ba6c10d96374902b9e08134f (patch)
tree	de2ad39d76c48718a7faf5caa995ae6d259f51e2 /external/kafka
parent	81da3bee669aaeb79ec68baaf7c99bff6e5d14fe (diff)
download	spark-bc8890b357811612ba6c10d96374902b9e08134f.tar.gz spark-bc8890b357811612ba6c10d96374902b9e08134f.tar.bz2 spark-bc8890b357811612ba6c10d96374902b9e08134f.zip

[SPARK-13132][MLLIB] cache standardization param value in LogisticRegression

Cache the value of the standardization Param in LogisticRegression, rather than re-fetching it from the ParamMap for every index and every optimization step in the quasi-Newton optimizer.

Also, change Param#toString to cache the stringified representation, rather than re-interpolating it on every call, so that other implementations with similar repeated access patterns also benefit.

This change reduces training time for one of my test sets from ~7m30s to ~4m30s.

Author: Gary King <gary@idibon.com>

Closes #11027 from idigary/spark-13132-optimize-logistic-regression.
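The two optimizations described in the commit message can be illustrated with a minimal, language-neutral sketch. It is written in Python for brevity and is not the Spark code touched by this commit. The `Param` class, the two penalty functions, and the regularization formula are hypothetical stand-ins, and the sketch assumes that parameter lookups hash on the string form, which is why a cached string would speed up every map access.

```python
from typing import Dict, List


class Param:
    """Hypothetical stand-in for a parameter key; not Spark's Param class."""

    def __init__(self, parent: str, name: str) -> None:
        self.parent = parent
        self.name = name
        # Build the string form once, rather than re-interpolating per call.
        self._string_repr = f"{parent}__{name}"

    def __str__(self) -> str:
        return self._string_repr

    # Assumption: map lookups key on the string form, so every lookup
    # touches the cached string.
    def __hash__(self) -> int:
        return hash(self._string_repr)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Param) and self._string_repr == other._string_repr


def l2_penalty_uncached(param_map: Dict[Param, bool], standardization: Param,
                        coefficients: List[float], features_std: List[float],
                        reg_param: float) -> float:
    """Looks the flag up in the map once per coefficient (the slow pattern)."""
    total = 0.0
    for coef, std in zip(coefficients, features_std):
        if param_map[standardization]:  # repeated lookup inside the hot loop
            total += 0.5 * reg_param * coef * coef
        elif std != 0.0:
            total += 0.5 * reg_param * coef * coef / (std * std)
    return total


def l2_penalty_cached(param_map: Dict[Param, bool], standardization: Param,
                      coefficients: List[float], features_std: List[float],
                      reg_param: float) -> float:
    """Reads the flag once, then reuses the local value inside the loop."""
    use_standardization = param_map[standardization]  # single lookup
    total = 0.0
    for coef, std in zip(coefficients, features_std):
        if use_standardization:
            total += 0.5 * reg_param * coef * coef
        elif std != 0.0:
            total += 0.5 * reg_param * coef * coef / (std * std)
    return total
```

Both functions return the same value; the cached variant only moves the lookup out of the loop, so the saving grows with the number of coefficients and optimizer iterations.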

Diffstat (limited to 'external/kafka')

0 files changed, 0 insertions, 0 deletions

