author | Yanbo Liang <ybliang8@gmail.com> | 2016-08-09 03:39:57 -0700 |
---|---|---|
committer | Yanbo Liang <ybliang8@gmail.com> | 2016-08-09 03:39:57 -0700 |
commit | 182e11904bf2093c2faa57894a1c4bb11d872596 | |
tree | ed32964fc35e5626ccc698de03a67d30d2e3c0d0 /docs/ml-decision-tree.md | |
parent | 511f52f8423e151b0d0133baf040d34a0af3d422 | |
[SPARK-16933][ML] Fix AFTAggregator in AFTSurvivalRegression serializes unnecessary data.
## What changes were proposed in this pull request?
Similar to ```LeastSquaresAggregator``` in #14109, ```AFTAggregator``` used for ```AFTSurvivalRegression``` ends up serializing ```parameters``` and ```featuresStd```, which is unnecessary and can cause performance issues for high-dimensional data. This patch removes that serialization. This PR is heavily inspired by #14109.
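The effect described above can be illustrated with a minimal sketch. It is not the code from this patch: it uses plain Java serialization instead of Spark, and `FakeBroadcast`, `FakeBroadcastRegistry`, `EagerAggregator`, `LeanAggregator`, and the toy weighted sum are hypothetical names invented for the illustration. The assumed idea is that an aggregator holding the arrays as fields writes them out every time it is serialized, while an aggregator holding only small handles and resolving them through `@transient lazy val` does not. In Spark, the handle would presumably be a `Broadcast[T]` obtained from `SparkContext.broadcast`.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.collection.concurrent.TrieMap

object SerializationSketch {

  // Hypothetical stand-in for a broadcast mechanism: values live in a local
  // registry and only the numeric id travels with a serialized object.
  object FakeBroadcastRegistry {
    private val store = TrieMap.empty[Long, Array[Double]]
    def put(id: Long, value: Array[Double]): FakeBroadcast = {
      store.put(id, value)
      new FakeBroadcast(id)
    }
    def get(id: Long): Array[Double] = store(id)
  }

  final class FakeBroadcast(val id: Long) extends Serializable {
    def value: Array[Double] = FakeBroadcastRegistry.get(id)
  }

  // "Before" shape: both arrays become fields of the class, so they are
  // written out whenever the aggregator is serialized.
  class EagerAggregator(parameters: Array[Double], featuresStd: Array[Double])
      extends Serializable {
    private var weightedSum = 0.0
    def add(features: Array[Double]): this.type = {
      var i = 0
      while (i < features.length) {
        if (featuresStd(i) != 0.0) weightedSum += parameters(i) * features(i) / featuresStd(i)
        i += 1
      }
      this
    }
    def result: Double = weightedSum
  }

  // "After" shape: only the handles are fields. The arrays are resolved
  // lazily and marked @transient, so serialization skips them.
  class LeanAggregator(bcParameters: FakeBroadcast, bcFeaturesStd: FakeBroadcast)
      extends Serializable {
    @transient private lazy val parameters: Array[Double] = bcParameters.value
    @transient private lazy val featuresStd: Array[Double] = bcFeaturesStd.value
    private var weightedSum = 0.0
    def add(features: Array[Double]): this.type = {
      var i = 0
      while (i < features.length) {
        if (featuresStd(i) != 0.0) weightedSum += parameters(i) * features(i) / featuresStd(i)
        i += 1
      }
      this
    }
    def result: Double = weightedSum
  }

  // Size in bytes of the Java-serialized form of an object.
  def serializedSize(obj: AnyRef): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    bytes.size()
  }

  def main(args: Array[String]): Unit = {
    val dim = 100000
    val params = Array.fill(dim)(0.5)
    val std = Array.fill(dim)(2.0)
    val features = Array.fill(dim)(1.0)

    // add() is called first so the lazy vals are already materialized
    // when the lean aggregator is serialized.
    val eager = new EagerAggregator(params, std).add(features)
    val lean = new LeanAggregator(
      FakeBroadcastRegistry.put(1L, params),
      FakeBroadcastRegistry.put(2L, std)).add(features)

    println(s"eager aggregator: ${serializedSize(eager)} bytes")
    println(s"lean aggregator:  ${serializedSize(lean)} bytes")
  }
}
```

Running `SerializationSketch.main` prints the two sizes. The eager aggregator's size grows with the feature dimension, while the lean aggregator's size should stay roughly constant, since only the two ids and the running sum are written.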
## How was this patch tested?
I tested this locally and verified the serialization reduction.
Before patch
![image](https://cloud.githubusercontent.com/assets/1962026/17512035/abb93f04-5dda-11e6-97d3-8ae6b61a0dfd.png)
After patch
![image](https://cloud.githubusercontent.com/assets/1962026/17512024/9e0dc44c-5dda-11e6-93d0-6e130ba0d6aa.png)
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #14519 from yanboliang/spark-16933.