author: Xiangrui Meng <meng@databricks.com> 2015-08-29 23:26:23 -0700
committer: Xiangrui Meng <meng@databricks.com> 2015-08-29 23:26:23 -0700
commit: 905fbe498bdd29116468628e6a2a553c1fd57165
tree: 9a1859c344848b7427ee2113114e78a884c083b8 /docs/mllib-guide.md
parent: 13f5f8ec97c6886346641b73bd99004e0d70836c
[SPARK-10348] [MLLIB] updates ml-guide

* replace `ML Dataset` by `DataFrame` to unify the abstraction
* ML algorithms -> pipeline components to describe the main concept
* remove Scala API doc links from the main guide
* `Section Title` -> `Section title` to be consistent with other section titles in MLlib guide
* modified lines break at 100 chars or periods

jkbradley feynmanliang

Author: Xiangrui Meng <meng@databricks.com>

Closes #8517 from mengxr/SPARK-10348.
Diffstat (limited to 'docs/mllib-guide.md')
-rw-r--r-- docs/mllib-guide.md | 12
1 files changed, 6 insertions, 6 deletions
diff --git a/docs/mllib-guide.md b/docs/mllib-guide.md
index 876dcfd40e..257f7cc760 100644
--- a/docs/mllib-guide.md
+++ b/docs/mllib-guide.md
@@ -14,9 +14,9 @@ primitives and higher-level pipeline APIs.
It divides into two packages:
* [`spark.mllib`](mllib-guide.html#mllib-types-algorithms-and-utilities) contains the original API
- built on top of RDDs.
+ built on top of [RDDs](programming-guide.html#resilient-distributed-datasets-rdds).
* [`spark.ml`](mllib-guide.html#sparkml-high-level-apis-for-ml-pipelines) provides higher-level API
- built on top of DataFrames for constructing ML pipelines.
+ built on top of [DataFrames](sql-programming-guide.html#dataframes) for constructing ML pipelines.
Using `spark.ml` is recommended because with DataFrames the API is more versatile and flexible.
But we will keep supporting `spark.mllib` along with the development of `spark.ml`.
@@ -57,19 +57,19 @@ We list major functionality from both below, with links to detailed guides.
* [FP-growth](mllib-frequent-pattern-mining.html#fp-growth)
* [association rules](mllib-frequent-pattern-mining.html#association-rules)
* [PrefixSpan](mllib-frequent-pattern-mining.html#prefix-span)
-* [Evaluation Metrics](mllib-evaluation-metrics.html)
+* [Evaluation metrics](mllib-evaluation-metrics.html)
+* [PMML model export](mllib-pmml-model-export.html)
* [Optimization (developer)](mllib-optimization.html)
* [stochastic gradient descent](mllib-optimization.html#stochastic-gradient-descent-sgd)
* [limited-memory BFGS (L-BFGS)](mllib-optimization.html#limited-memory-bfgs-l-bfgs)
-* [PMML model export](mllib-pmml-model-export.html)
# spark.ml: high-level APIs for ML pipelines
**[spark.ml programming guide](ml-guide.html)** provides an overview of the Pipelines API and major
concepts. It also contains sections on using algorithms within the Pipelines API, for example:
-* [Feature Extraction, Transformation, and Selection](ml-features.html)
-* [Decision Trees for Classification and Regression](ml-decision-tree.html)
+* [Feature extraction, transformation, and selection](ml-features.html)
+* [Decision trees for classification and regression](ml-decision-tree.html)
* [Ensembles](ml-ensembles.html)
* [Linear methods with elastic net regularization](ml-linear-methods.html)
* [Multilayer perceptron classifier](ml-ann.html)