Diffstat (limited to 'docs')
-rw-r--r-- | docs/mllib-clustering.md | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/docs/mllib-clustering.md b/docs/mllib-clustering.md
index 3aad4149f9..d72dc20a5a 100644
--- a/docs/mllib-clustering.md
+++ b/docs/mllib-clustering.md
@@ -447,7 +447,7 @@ It supports different inference algorithms via `setOptimizer` function. EMLDAOpt
 on the likelihood function and yields comprehensive results, while OnlineLDAOptimizer uses iterative mini-batch sampling for [online variational inference](https://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf) and is generally memory friendly. After fitting on the documents, LDA provides:
 
 * Topics: Inferred topics, each of which is a probability distribution over terms (words).
-* Topic distributions for documents: For each document in the training set, LDA gives a probability distribution over topics. (EM only)
+* Topic distributions for documents: For each non empty document in the training set, LDA gives a probability distribution over topics. (EM only). Note that for empty documents, we don't create the topic distributions. (EM only)
 
 LDA takes the following parameters:
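
The behaviour this doc change describes can be checked with a small Scala sketch against the `spark.mllib` API. It is illustrative only and not part of the patch: the object name `EmptyDocLdaSketch`, the toy corpus, and the local master setting are all assumptions made for the example. It fits LDA with the EM optimizer on three documents, one of which has no terms, and prints which document ids appear in `topicDistributions`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{DistributedLDAModel, LDA}
import org.apache.spark.mllib.linalg.Vectors

// Hypothetical driver object; the name and data are made up for illustration.
object EmptyDocLdaSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("EmptyDocLdaSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val vocabSize = 4
      // Each document is (id, term-count vector). Document 2 is empty.
      val corpus = sc.parallelize(Seq(
        (0L, Vectors.dense(2.0, 1.0, 0.0, 0.0)),
        (1L, Vectors.dense(0.0, 0.0, 3.0, 1.0)),
        (2L, Vectors.zeros(vocabSize))
      ))

      // "em" selects the EM optimizer, which yields a DistributedLDAModel.
      val model = new LDA()
        .setK(2)
        .setMaxIterations(20)
        .setOptimizer("em")
        .run(corpus)

      model match {
        case distributed: DistributedLDAModel =>
          // Ids of the documents that received a topic distribution.
          val ids = distributed.topicDistributions.keys.collect().sorted
          println(s"Documents with topic distributions: ${ids.mkString(", ")}")
        case _ =>
          println("Model is not distributed; topicDistributions is unavailable.")
      }
    } finally {
      sc.stop()
    }
  }
}
```

If the note added by this patch holds, the empty document (id 2) should be absent from the printed ids, so callers that join `topicDistributions` back to the input corpus need to allow for missing keys.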