path: root/docs/streaming-kinesis-integration.md
author: Yanbo Liang <ybliang8@gmail.com> 2016-09-29 04:30:42 -0700
committer: Yanbo Liang <ybliang8@gmail.com> 2016-09-29 04:30:42 -0700
commit: f7082ac12518ae84d6d1d4b7330a9f12cf95e7c1
tree: c657915e4a09298fb6e8ca77d127bbfd3f7c35e3 /docs/streaming-kinesis-integration.md
parent: a19a1bb59411177caaf99581e89098826b7d0c7b
[SPARK-17704][ML][MLLIB] ChiSqSelector performance improvement.
## What changes were proposed in this pull request?

Several performance improvements for `ChiSqSelector`:

1. Keep `selectedFeatures` sorted in ascending order. `ChiSqSelectorModel.transform` needs `selectedFeatures` to be ordered to make predictions. We should sort it when training the model rather than when making predictions, since users usually train a model once and use it for prediction many times.
2. When training an `fpr` type `ChiSqSelectorModel`, it is not necessary to sort the ChiSq test results by statistic.

## How was this patch tested?

Existing unit tests.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #15277 from yanboliang/spark-17704.
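
The two changes described in this commit message can be illustrated with a small, language-neutral sketch. This is not the Spark implementation: `select_top_k`, `select_fpr`, and `SelectorModel` are hypothetical names, and the `p < alpha` threshold rule for the `fpr` type is an assumption made for illustration. The sketch only shows why sorting once at training time is cheaper than sorting on every `transform` call, and why a threshold-based selection needs no ranking by statistic.

```python
from typing import List, Sequence


def select_top_k(statistics: Sequence[float], k: int) -> List[int]:
    """Rank features by test statistic (descending), keep k, then sort the
    kept indices ascending once, at training time."""
    ranked = sorted(range(len(statistics)),
                    key=lambda i: statistics[i],
                    reverse=True)[:k]
    return sorted(ranked)


def select_fpr(p_values: Sequence[float], alpha: float) -> List[int]:
    """Threshold-based selection (assumed rule: keep p-value < alpha).
    No ranking by statistic is needed; scanning in index order already
    yields ascending indices."""
    return [i for i, p in enumerate(p_values) if p < alpha]


class SelectorModel:
    """Hypothetical stand-in for a trained selector model."""

    def __init__(self, selected_features: Sequence[int]) -> None:
        # Callers are expected to pass indices that were already sorted
        # during training, so transform() never has to sort.
        self.selected_features = list(selected_features)

    def transform(self, vector: Sequence[float]) -> List[float]:
        # Relies on selected_features being in ascending order.
        return [vector[i] for i in self.selected_features]


top_k = select_top_k([1.0, 7.0, 3.0, 9.0], k=2)
fpr = select_fpr([0.01, 0.5, 0.03, 0.2], alpha=0.05)
projected = SelectorModel(top_k).transform([10.0, 20.0, 30.0, 40.0])
```

Because the model is typically trained once and applied many times, paying the sort cost in `select_top_k` keeps every later `transform` call to a single pass over the selected indices.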
Diffstat (limited to 'docs/streaming-kinesis-integration.md')
0 files changed, 0 insertions, 0 deletions