field | value | date |
---|---|---|
author | Yuhao Yang <hhbyyh@gmail.com> | 2015-07-31 11:50:15 -0700 |
committer | Joseph K. Bradley <joseph@databricks.com> | 2015-07-31 11:50:15 -0700 |
commit | 4011a947154d97a9ffb5a71f077481a12534d36b | |
tree | b117215285eae619072afca425a0c35ec9b1d960 /dev/sparktestsupport/modules.py | |
parent | 6add4eddb39e7748a87da3e921ea3c7881d30a82 | |
download | spark-4011a947154d97a9ffb5a71f077481a12534d36b.tar.gz spark-4011a947154d97a9ffb5a71f077481a12534d36b.tar.bz2 spark-4011a947154d97a9ffb5a71f077481a12534d36b.zip | |
[SPARK-9231] [MLLIB] DistributedLDAModel method for top topics per document
jira: https://issues.apache.org/jira/browse/SPARK-9231
Helper method in DistributedLDAModel of this form:
```
/**
 * For each document, return the top k weighted topics for that document.
 * @param k the number of topics to return for each document
 * @return RDD of (doc ID, topic indices, topic weights)
 */
def topTopicsPerDocument(k: Int): RDD[(Long, Array[Int], Array[Double])]
```
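To make the shape of the result concrete, here is a minimal sketch of the per-document selection on plain Scala collections. It is not the code from this commit: the object `TopTopicsSketch` and the helpers `topK` and `topTopicsPerDocumentLocal` are hypothetical names, a local `Seq` stands in for the RDD, and the descending order of the returned weights is an assumption of the sketch rather than something the commit message states.

```scala
// Illustrative sketch only. Plain Scala collections stand in for the RDD,
// and every name in this object is hypothetical (not part of Spark).
object TopTopicsSketch {

  /** Indices and weights of the k largest entries, largest weight first. */
  def topK(weights: Array[Double], k: Int): (Array[Int], Array[Double]) = {
    // Pair each weight with its topic index, then sort by weight, descending.
    val top = weights.zipWithIndex.sortWith(_._1 > _._1).take(k)
    (top.map(_._2), top.map(_._1))
  }

  /** Local analogue of the (doc ID, topic indices, topic weights) triples. */
  def topTopicsPerDocumentLocal(
      docTopics: Seq[(Long, Array[Double])],
      k: Int): Seq[(Long, Array[Int], Array[Double])] =
    docTopics.map { case (docId, distribution) =>
      val (indices, topWeights) = topK(distribution, k)
      (docId, indices, topWeights)
    }
}
```

For example, given a document with topic weights `Array(0.1, 0.6, 0.3)`, calling `TopTopicsSketch.topTopicsPerDocumentLocal(docs, 2)` keeps topics 1 and 2 for that document. With a trained `DistributedLDAModel`, the method described above would be invoked as `model.topTopicsPerDocument(k)` and would return the same triple shape as an RDD.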
Author: Yuhao Yang <hhbyyh@gmail.com>
Closes #7785 from hhbyyh/topTopicsPerdoc and squashes the following commits:
30ad153 [Yuhao Yang] small fix
fd24580 [Yuhao Yang] add topTopics per document to DistributedLDAModel
Diffstat (limited to 'dev/sparktestsupport/modules.py')
0 files changed, 0 insertions, 0 deletions