author | Yuhao Yang <hhbyyh@gmail.com> | 2015-11-20 09:57:09 -0800 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2015-11-20 09:57:09 -0800 |
commit | e359d5dcf5bd300213054ebeae9fe75c4f7eb9e7 (patch) | |
tree | f008aa43b00a40ecb4ebc20f47b9de6a57d47702 /docs | |
parent | 9ace2e5c8d7fbd360a93bc5fc4eace64a697b44f (diff) | |
[SPARK-11689][ML] Add user guide and example code for LDA under spark.ml
jira: https://issues.apache.org/jira/browse/SPARK-11689
Add simple user guide for LDA under spark.ml and example code under examples/. Use include_example to include example code in the user guide markdown. Check SPARK-11606 for instructions.
Author: Yuhao Yang <hhbyyh@gmail.com>
Closes #9722 from hhbyyh/ldaMLExample.
Diffstat (limited to 'docs')
-rw-r--r-- | docs/ml-clustering.md | 30 |
-rw-r--r-- | docs/ml-guide.md | 3 |
-rw-r--r-- | docs/mllib-guide.md | 1 |
3 files changed, 33 insertions, 1 deletions
diff --git a/docs/ml-clustering.md b/docs/ml-clustering.md
new file mode 100644
index 0000000000..1743ef43a6
--- /dev/null
+++ b/docs/ml-clustering.md
@@ -0,0 +1,30 @@
+---
+layout: global
+title: Clustering - ML
+displayTitle: <a href="ml-guide.html">ML</a> - Clustering
+---
+
+In this section, we introduce the pipeline API for [clustering in mllib](mllib-clustering.html).
+
+## Latent Dirichlet allocation (LDA)
+
+`LDA` is implemented as an `Estimator` that supports both `EMLDAOptimizer` and `OnlineLDAOptimizer`,
+and generates a `LDAModel` as the base models. Expert users may cast a `LDAModel` generated by
+`EMLDAOptimizer` to a `DistributedLDAModel` if needed.
+
+<div class="codetabs">
+
+Refer to the [Scala API docs](api/scala/index.html#org.apache.spark.ml.clustering.LDA) for more details.
+
+<div data-lang="scala" markdown="1">
+{% include_example scala/org/apache/spark/examples/ml/LDAExample.scala %}
+</div>
+
+<div data-lang="java" markdown="1">
+
+Refer to the [Java API docs](api/java/org/apache/spark/ml/clustering/LDA.html) for more details.
+
+{% include_example java/org/apache/spark/examples/ml/JavaLDAExample.java %}
+</div>
+
+</div>
\ No newline at end of file
diff --git a/docs/ml-guide.md b/docs/ml-guide.md
index be18a05361..6f35b30c3d 100644
--- a/docs/ml-guide.md
+++ b/docs/ml-guide.md
@@ -40,6 +40,7 @@ Also, some algorithms have additional capabilities in the `spark.ml` API; e.g.,
 provide class probabilities, and linear models provide model summaries.
 
 * [Feature extraction, transformation, and selection](ml-features.html)
+* [Clustering](ml-clustering.html)
 * [Decision Trees for classification and regression](ml-decision-tree.html)
 * [Ensembles](ml-ensembles.html)
 * [Linear methods with elastic net regularization](ml-linear-methods.html)
@@ -950,4 +951,4 @@ model.transform(test)
 {% endhighlight %}
 </div>
 
-</div>
+</div>
\ No newline at end of file
diff --git a/docs/mllib-guide.md b/docs/mllib-guide.md
index 91e50ccfec..54e35fcbb1 100644
--- a/docs/mllib-guide.md
+++ b/docs/mllib-guide.md
@@ -69,6 +69,7 @@ We list major functionality from both below, with links to detailed guides.
 concepts. It also contains sections on using algorithms within the Pipelines API, for example:
 
 * [Feature extraction, transformation, and selection](ml-features.html)
+* [Clustering](ml-clustering.html)
 * [Decision trees for classification and regression](ml-decision-tree.html)
 * [Ensembles](ml-ensembles.html)
 * [Linear methods with elastic net regularization](ml-linear-methods.html)