[SPARK-18456][ML][FOLLOWUP] Use matrix abstraction for coefficients in LogisticRegression training - spark

author	sethah <seth.hendrickson16@gmail.com>	2016-11-20 01:42:37 +0000
committer	DB Tsai <dbtsai@dbtsai.com>	2016-11-20 01:42:37 +0000
commit	856e0042007c789dda4539fb19a5d4580999fbf4 (patch)
tree	25c67679bce2bec591dd0f739ba265660a29c5af /assembly
parent	ea77c81ec0db27ea4709f71dc080d00167505a7d (diff)
download	spark-856e0042007c789dda4539fb19a5d4580999fbf4.tar.gz spark-856e0042007c789dda4539fb19a5d4580999fbf4.tar.bz2 spark-856e0042007c789dda4539fb19a5d4580999fbf4.zip

[SPARK-18456][ML][FOLLOWUP] Use matrix abstraction for coefficients in LogisticRegression training

## What changes were proposed in this pull request?

This is a follow up to some of the discussion [here](https://github.com/apache/spark/pull/15593). During LogisticRegression training, we store the coefficients combined with intercepts as a flat vector, but a more natural abstraction is a matrix. Here, we refactor the code to use matrix where possible, which makes the code more readable and greatly simplifies the indexing.

Note: We do not use a Breeze matrix for the cost function as was mentioned in the linked PR. This is because LBFGS/OWLQN require an implicit `MutableInnerProductModule[DenseMatrix[Double], Double]` which is not natively defined in Breeze. We would need to extend Breeze in Spark to define it ourselves. Also, we do not modify the `regParamL1Fun` because OWLQN in Breeze requires a `MutableEnumeratedCoordinateField[(Int, Int), DenseVector[Double]]` (since we still use a dense vector for coefficients). Here again we would have to extend Breeze inside Spark.

## How was this patch tested?

This is internal code refactoring - the current unit tests passing show us that the change did not break anything. No added functionality in this patch.

Author: sethah <seth.hendrickson16@gmail.com>

Closes #15893 from sethah/logreg_refactor.
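To make the indexing point concrete, here is a minimal sketch of the difference between addressing coefficients through a flat array and through a matrix view. It is not the code from this patch: `CoefficientLayoutSketch` and `DenseView` are hypothetical names, and the column-major layout with the intercepts in the last column is an assumption made for illustration.

```scala
// Illustrative sketch only; names and layout are assumptions, not Spark's code.
object CoefficientLayoutSketch {

  // Hypothetical column-major matrix view over a flat array.
  // Rows are classes, columns are features (plus one intercept column).
  final case class DenseView(numRows: Int, numCols: Int, values: Array[Double]) {
    require(values.length == numRows * numCols, "flat array has the wrong size")

    // Read entry (i, j); the offset arithmetic lives in exactly one place.
    def apply(i: Int, j: Int): Double = values(j * numRows + i)

    // Write entry (i, j); enables the `m(i, j) = v` syntax.
    def update(i: Int, j: Int, v: Double): Unit = values(j * numRows + i) = v
  }

  def main(args: Array[String]): Unit = {
    val numClasses = 3
    val numFeatures = 2
    val numCols = numFeatures + 1 // assumed: last column holds the intercepts

    val flat = new Array[Double](numClasses * numCols)

    // Flat-vector style: every access repeats the offset computation.
    val classIdx = 1
    val featureIdx = 0
    flat(featureIdx * numClasses + classIdx) = 0.5
    flat(numFeatures * numClasses + classIdx) = -1.0 // intercept for class 1

    // Matrix style: the same storage, addressed by (class, feature).
    val coef = DenseView(numClasses, numCols, flat)
    assert(coef(classIdx, featureIdx) == 0.5)
    assert(coef(classIdx, numFeatures) == -1.0)

    // Writes through the view land in the same backing array.
    coef(2, 1) = 2.0
    assert(flat(1 * numClasses + 2) == 2.0)
  }
}
```

Because the view wraps the same backing array, an optimizer that expects a dense vector can keep operating on `flat` while the surrounding code reads and writes by row and column, which is one way to reconcile the matrix abstraction with the Breeze constraints described in the note above.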

Diffstat (limited to 'assembly')

0 files changed, 0 insertions, 0 deletions
