1 files changed, 4 insertions, 0 deletions
diff --git a/docs/mllib-clustering.md b/docs/mllib-clustering.md
index 3fb35d3c50..c2711cf82d 100644
--- a/docs/mllib-clustering.md
+++ b/docs/mllib-clustering.md
@@ -507,6 +507,10 @@ must also be $> 1.0$. Providing `Vector(-1)` results in default behavior
 $> 1.0$. Providing `-1` results in defaulting to a value of $0.1 + 1$.
 * `maxIterations`: The maximum number of EM iterations.
 
+*Note*: It is important to run enough iterations. In early iterations, EM often produces uninformative topics,
+but those topics improve dramatically with more iterations. Using at least 20 and possibly
+50-100 iterations is often reasonable, depending on your dataset.
+
 `EMLDAOptimizer` produces a `DistributedLDAModel`, which stores not only
 the inferred topics but also the full training corpus and topic
 distributions for each document in the training corpus. A