author | WeichenXu <WeichenXu123@outlook.com> | 2016-07-26 10:41:41 +0100 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2016-07-26 10:41:41 +0100 |
commit | 4c9695598ee00f68aff4eb32d4629edf6facb29f (patch) | |
tree | e254af39c8bce081b34cd2c21995d3962f61d1fe /core | |
parent | 3b2b785ece4394ca332377647a6305ea493f411b (diff) | |
[SPARK-16697][ML][MLLIB] improve LDA submitMiniBatch method to avoid redundant RDD computation
## What changes were proposed in this pull request?
In `LDAOptimizer.submitMiniBatch`, persist `stats: RDD[(BDM[Double], List[BDV[Double]])]` so that it is not recomputed each time it is used.
Also move the cleanup of the `expElogbetaBc` broadcast variable to a later point in the method, so that the broadcast is not released too early,
and replace the previous `expElogbetaBc.unpersist()` call with `expElogbetaBc.destroy(false)`.
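
The pattern described above can be illustrated with a minimal, self-contained sketch. This is not the code from `LDAOptimizer`: the object name `PersistBroadcastSketch`, the `weights` broadcast (standing in for `expElogbetaBc`), the toy `docs` RDD, and the `evaluations` accumulator are all hypothetical and exist only to show the ordering. It assumes a Spark 2.0+ dependency and a local master. It also calls the public `destroy()` because, as far as I know, the `destroy(blocking)` overload used in the patch is only accessible from inside Spark's own packages.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object PersistBroadcastSketch {
  // Hypothetical stand-in for the per-document work done in submitMiniBatch.
  def run(sc: SparkContext): (Double, Long, Long) = {
    // Plays the role of expElogbetaBc (assumption: any read-only broadcast value).
    val weights = sc.broadcast(Array(0.5, 1.5, 2.0))
    // Counts how many times the map function below actually runs.
    val evaluations = sc.longAccumulator("evaluations")

    val docs = sc.parallelize(0 until 6, numSlices = 2)
    val stats = docs.map { i =>
      evaluations.add(1)
      (i, weights.value(i % 3) * i)
    }.persist(StorageLevel.MEMORY_AND_DISK) // cache, because two actions follow

    val total = stats.map(_._2).sum() // first action: computes and caches stats
    val n = stats.count()             // second action: served from the cache

    // Release resources only after every action that needs them has finished.
    stats.unpersist()
    weights.destroy()

    (total, n, evaluations.sum)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("persist-sketch")
    val sc = new SparkContext(conf)
    try println(run(sc)) finally sc.stop()
  }
}
```

Without the `persist` call, each action would re-run the `map` over `docs`, and releasing the broadcast before the last action would force it to be re-sent or, after `destroy`, make it unusable. Keeping both cleanup calls after the final action is the ordering this sketch is meant to show.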
## How was this patch tested?
Existing tests.
Author: WeichenXu <WeichenXu123@outlook.com>
Closes #14335 from WeichenXu123/improve_LDA.
Diffstat (limited to 'core')
0 files changed, 0 insertions, 0 deletions