From 905fbe498bdd29116468628e6a2a553c1fd57165 Mon Sep 17 00:00:00 2001
From: Xiangrui Meng
Date: Sat, 29 Aug 2015 23:26:23 -0700
Subject: [SPARK-10348] [MLLIB] updates ml-guide

* replace `ML Dataset` by `DataFrame` to unify the abstraction
* ML algorithms -> pipeline components to describe the main concept
* remove Scala API doc links from the main guide
* `Section Title` -> `Section title` to be consistent with other section titles in MLlib guide
* modified lines break at 100 chars or periods

jkbradley feynmanliang

Author: Xiangrui Meng

Closes #8517 from mengxr/SPARK-10348.
---
 docs/mllib-guide.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

(limited to 'docs/mllib-guide.md')

diff --git a/docs/mllib-guide.md b/docs/mllib-guide.md
index 876dcfd40e..257f7cc760 100644
--- a/docs/mllib-guide.md
+++ b/docs/mllib-guide.md
@@ -14,9 +14,9 @@ primitives and higher-level pipeline APIs.
 It divides into two packages:
 
 * [`spark.mllib`](mllib-guide.html#mllib-types-algorithms-and-utilities) contains the original API
-  built on top of RDDs.
+  built on top of [RDDs](programming-guide.html#resilient-distributed-datasets-rdds).
 * [`spark.ml`](mllib-guide.html#sparkml-high-level-apis-for-ml-pipelines) provides higher-level API
-  built on top of DataFrames for constructing ML pipelines.
+  built on top of [DataFrames](sql-programming-guide.html#dataframes) for constructing ML pipelines.
 
 Using `spark.ml` is recommended because with DataFrames the API is more versatile and flexible.
 But we will keep supporting `spark.mllib` along with the development of `spark.ml`.
@@ -57,19 +57,19 @@ We list major functionality from both below, with links to detailed guides.
   * [FP-growth](mllib-frequent-pattern-mining.html#fp-growth)
   * [association rules](mllib-frequent-pattern-mining.html#association-rules)
   * [PrefixSpan](mllib-frequent-pattern-mining.html#prefix-span)
-* [Evaluation Metrics](mllib-evaluation-metrics.html)
+* [Evaluation metrics](mllib-evaluation-metrics.html)
+* [PMML model export](mllib-pmml-model-export.html)
 * [Optimization (developer)](mllib-optimization.html)
   * [stochastic gradient descent](mllib-optimization.html#stochastic-gradient-descent-sgd)
   * [limited-memory BFGS (L-BFGS)](mllib-optimization.html#limited-memory-bfgs-l-bfgs)
-* [PMML model export](mllib-pmml-model-export.html)
 
 # spark.ml: high-level APIs for ML pipelines
 
 **[spark.ml programming guide](ml-guide.html)** provides an overview of the Pipelines API and major
 concepts. It also contains sections on using algorithms within the Pipelines API, for example:
 
-* [Feature Extraction, Transformation, and Selection](ml-features.html)
-* [Decision Trees for Classification and Regression](ml-decision-tree.html)
+* [Feature extraction, transformation, and selection](ml-features.html)
+* [Decision trees for classification and regression](ml-decision-tree.html)
 * [Ensembles](ml-ensembles.html)
 * [Linear methods with elastic net regularization](ml-linear-methods.html)
 * [Multilayer perceptron classifier](ml-ann.html)
-- 
cgit v1.2.3