author    DB Tsai <dbtsai@alpinenow.com>  2014-04-15 11:12:47 -0700
committer Patrick Wendell <pwendell@gmail.com>  2014-04-15 11:12:47 -0700
commit    6843d637e72e5262d05cfa2b1935152743f2bd5a (patch)
tree      7dfb8d7d01b8a729c634d0cf285698de4bc9c75c /dev
parent    2580a3b1a06188fa97d9440d793c8835ef7384b0 (diff)
[SPARK-1157][MLlib] L-BFGS Optimizer based on Breeze's implementation.
This PR uses Breeze's L-BFGS implementation; the Breeze dependency has already been introduced by Xiangrui's sparse input format work in SPARK-1212. Nice work, @mengxr!
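As a minimal sketch of the Breeze API this PR builds on: Breeze exposes an `LBFGS` optimizer that minimizes any `DiffFunction` (a function returning both the loss and its gradient). The quadratic objective below is illustrative only, not the PR's code.

```scala
import breeze.linalg.DenseVector
import breeze.optimize.{DiffFunction, LBFGS}

object BreezeLBFGSDemo {
  // f(x) = ||x - 3||^2, with gradient 2 * (x - 3); minimum at x = (3, ..., 3).
  val f = new DiffFunction[DenseVector[Double]] {
    def calculate(x: DenseVector[Double]): (Double, DenseVector[Double]) = {
      val diff = x - 3.0
      (diff dot diff, diff * 2.0)
    }
  }

  def solve(): DenseVector[Double] = {
    // m is the number of correction pairs kept for the Hessian approximation.
    val lbfgs = new LBFGS[DenseVector[Double]](maxIter = 100, m = 10, tolerance = 1e-9)
    lbfgs.minimize(f, DenseVector.zeros[Double](5))
  }

  def main(args: Array[String]): Unit = {
    println(solve())
  }
}
```

MLlib's optimizer wraps this same mechanism, supplying a `DiffFunction` whose loss and gradient come from the training data plus the regularization terms discussed below.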
When used with a regularized updater, we need to compute regVal and regGradient (the gradient of the regularization term in the cost function). With the current updater design, we can obtain these two values as follows.
Let's review how the updater works when returning newWeights given the input parameters:
w' = w - thisIterStepSize * (gradient + regGradient(w))
Note that regGradient is a function of w! If we set gradient = 0 and thisIterStepSize = 1, then
regGradient(w) = w - w'
As a result, regVal can be computed by

    val regVal = updater.compute(
      weights,
      new DoubleMatrix(initialWeights.length, 1), 0, 1, regParam)._2

and regGradient can be obtained by

    val regGradient = weights.sub(
      updater.compute(weights, new DoubleMatrix(initialWeights.length, 1), 1, 1, regParam)._1)
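The trick above can be demonstrated with a self-contained sketch. `RegTrickDemo.compute` below is a toy stand-in that mirrors the shape of Spark's `Updater.compute` (weights, gradient, stepSize, iter, regParam) => (newWeights, regVal) for an L2 updater; it is hypothetical and uses plain arrays rather than Spark's vector types.

```scala
object RegTrickDemo {
  // Toy mirror of Updater.compute for L2 regularization:
  // w' = w - (stepSize / sqrt(iter)) * (gradient + regParam * w)
  // Returns (newWeights, regVal), where regVal = 0.5 * regParam * ||w'||^2.
  def compute(weights: Array[Double], gradient: Array[Double],
              stepSize: Double, iter: Int, regParam: Double): (Array[Double], Double) = {
    val thisIterStepSize = stepSize / math.sqrt(iter)
    val newWeights = weights.zip(gradient).map { case (w, g) =>
      w - thisIterStepSize * (g + regParam * w)
    }
    val regVal = 0.5 * regParam * newWeights.map(w => w * w).sum
    (newWeights, regVal)
  }

  def main(args: Array[String]): Unit = {
    val regParam = 0.1
    val w = Array(1.0, -2.0, 3.0)
    val zero = Array(0.0, 0.0, 0.0)

    // gradient = 0, stepSize = 1, iter = 1  =>  regGradient(w) = w - w'
    val (wPrime, _) = compute(w, zero, 1.0, 1, regParam)
    val regGradient = w.zip(wPrime).map { case (a, b) => a - b }

    // For L2, the recovered regGradient should equal regParam * w.
    regGradient.zip(w).foreach { case (rg, wi) =>
      assert(math.abs(rg - regParam * wi) < 1e-12)
    }
    println("regGradient recovered: " + regGradient.mkString(", "))
  }
}
```

The same call with gradient = 0 but stepSize left at its usual value would scale regGradient by thisIterStepSize, which is why both stepSize and iter are pinned to 1 in the trick.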
The PR includes tests that compare the results with SGD, with and without regularization.
We did a comparison between L-BFGS and SGD, and we often saw roughly 10x fewer
steps with L-BFGS, while the cost per step is the same (just computing
the gradient).
The following paper by Prof. Ng's group at Stanford compares different
optimizers, including L-BFGS and SGD. They use them in the context of
deep learning, but it is worth reading as a reference.
http://cs.stanford.edu/~jngiam/papers/LeNgiamCoatesLahiriProchnowNg2011.pdf
Author: DB Tsai <dbtsai@alpinenow.com>
Closes #353 from dbtsai/dbtsai-LBFGS and squashes the following commits:
984b18e [DB Tsai] L-BFGS Optimizer based on Breeze's implementation. Also fixed indentation issue in GradientDescent optimizer.