[SPARK-1506][MLLIB] Documentation improvements for MLlib 1.0

Preview: http://54.82.240.23:4000/mllib-guide.html

Table of contents:

* Basics
  * Data types
  * Summary statistics
* Classification and regression
  * linear support vector machine (SVM)
  * logistic regression
  * linear least squares, Lasso, and ridge regression
  * decision tree
  * naive Bayes
* Collaborative Filtering
  * alternating least squares (ALS)
* Clustering
  * k-means
* Dimensionality reduction
  * singular value decomposition (SVD)
  * principal component analysis (PCA)
* Optimization
  * stochastic gradient descent
  * limited-memory BFGS (L-BFGS)

Author: Xiangrui Meng <meng@databricks.com>

Closes #422 from mengxr/mllib-doc and squashes the following commits:

944e3a9 [Xiangrui Meng] merge master
f9fda28 [Xiangrui Meng] minor
9474065 [Xiangrui Meng] add alpha to ALS examples
928e630 [Xiangrui Meng] initialization_mode -> initializationMode
5bbff49 [Xiangrui Meng] add imports to labeled point examples
c17440d [Xiangrui Meng] fix python nb example
28f40dc [Xiangrui Meng] remove localhost:4000
369a4d3 [Xiangrui Meng] Merge branch 'master' into mllib-doc
7dc95cc [Xiangrui Meng] update linear methods
053ad8a [Xiangrui Meng] add links to go back to the main page
abbbf7e [Xiangrui Meng] update ALS argument names
648283e [Xiangrui Meng] level down statistics
14e2287 [Xiangrui Meng] add sample libsvm data and use it in guide
8cd2441 [Xiangrui Meng] minor updates
186ab07 [Xiangrui Meng] update section names
6568d65 [Xiangrui Meng] update toc, level up lr and svm
162ee12 [Xiangrui Meng] rename section names
5c1e1b1 [Xiangrui Meng] minor
8aeaba1 [Xiangrui Meng] wrap long lines
6ce6a6f [Xiangrui Meng] add summary statistics to toc
5760045 [Xiangrui Meng] claim beta
cc604bf [Xiangrui Meng] remove classification and regression
92747b3 [Xiangrui Meng] make section titles consistent
e605dd6 [Xiangrui Meng] add LIBSVM loader
f639674 [Xiangrui Meng] add python section to migration guide
c82ffb4 [Xiangrui Meng] clean optimization
31660eb [Xiangrui Meng] update linear algebra and stat
0a40837 [Xiangrui Meng] first pass over linear methods
1fc8271 [Xiangrui Meng] update toc
906ed0a [Xiangrui Meng] add a python example to naive bayes
5f0a700 [Xiangrui Meng] update collaborative filtering
656d416 [Xiangrui Meng] update mllib-clustering
86e143a [Xiangrui Meng] remove data types section from main page
8d1a128 [Xiangrui Meng] move part of linear algebra to data types and add Java/Python examples
d1b5cbf [Xiangrui Meng] merge master
72e4804 [Xiangrui Meng] one pass over tree guide
64f8995 [Xiangrui Meng] move decision tree guide to a separate file
9fca001 [Xiangrui Meng] add first version of linear algebra guide
53c9552 [Xiangrui Meng] update dependencies
f316ec2 [Xiangrui Meng] add migration guide
f399f6c [Xiangrui Meng] move linear-algebra to dimensionality-reduction
182460f [Xiangrui Meng] add guide for naive Bayes
137fd1d [Xiangrui Meng] re-organize toc
a61e434 [Xiangrui Meng] update mllib's toc
author: Xiangrui Meng <meng@databricks.com> 2014-04-22 11:20:47 -0700
committer: Patrick Wendell <pwendell@gmail.com> 2014-04-22 11:20:47 -0700
commit: 26d35f3fd942761b0adecd1a720e1fa834db4de9 (patch)
tree: 16e57e2ff01e7cd2d7a1a3c1f3bf98c9cf98a082 /docs/mllib-optimization.md
parent: bf9d49b6d1f668b49795c2d380ab7d64ec0029da (diff)
download: spark-26d35f3fd942761b0adecd1a720e1fa834db4de9.tar.gz
spark-26d35f3fd942761b0adecd1a720e1fa834db4de9.tar.bz2
spark-26d35f3fd942761b0adecd1a720e1fa834db4de9.zip
1 files changed, 12 insertions, 13 deletions
diff --git a/docs/mllib-optimization.md b/docs/mllib-optimization.md
index c79cc3d944..bec3912b55 100644
--- a/docs/mllib-optimization.md
+++ b/docs/mllib-optimization.md
@@ -1,6 +1,6 @@
 ---
 layout: global
-title: MLlib - Optimization
+title: <a href="mllib-guide.html">MLlib</a> - Optimization
 ---
 
 * Table of contents
@@ -25,9 +25,10 @@ title: MLlib - Optimization
 
 
 
-# Mathematical Description
+## Mathematical description
+
+### Gradient descent
 
-## (Sub)Gradient Descent
 The simplest method to solve optimization problems of the form `$\min_{\wv \in\R^d} \; f(\wv)$`
 is [gradient descent](http://en.wikipedia.org/wiki/Gradient_descent).
 Such first-order optimization methods (including gradient descent and stochastic variants
@@ -38,14 +39,14 @@ the direction of steepest descent, which is the negative of the derivative (call
 [gradient](http://en.wikipedia.org/wiki/Gradient)) of the function at the current point, i.e., at
 the current parameter value.
 If the objective function `$f$` is not differentiable at all arguments, but still convex, then a
-*subgradient* 
+*sub-gradient* 
 is the natural generalization of the gradient, and assumes the role of the step direction.
-In any case, computing a gradient or subgradient of `$f$` is expensive --- it requires a full
+In any case, computing a gradient or sub-gradient of `$f$` is expensive --- it requires a full
 pass through the complete dataset, in order to compute the contributions from all loss terms.
 
-## Stochastic (Sub)Gradient Descent (SGD)
+### Stochastic gradient descent (SGD)
 Optimization problems whose objective function `$f$` is written as a sum are particularly
-suitable to be solved using *stochastic subgradient descent (SGD)*. 
+suitable to be solved using *stochastic gradient descent (SGD)*. 
 In our case, for the optimization formulations commonly used in <a
 href="mllib-classification-regression.html">supervised machine learning</a>,
 `\begin{equation}
@@ -98,7 +99,7 @@ For the L1-regularizer, the proximal operator is given by soft thresholding, as
 [L1Updater](api/scala/index.html#org.apache.spark.mllib.optimization.L1Updater).
 
 
-## Update Schemes for Distributed SGD
+### Update schemes for distributed SGD
 The SGD implementation in
 [GradientDescent](api/scala/index.html#org.apache.spark.mllib.optimization.GradientDescent) uses
 a simple (distributed) sampling of the data examples.
@@ -129,12 +130,12 @@ point.
 
 
 
-# Implementation in MLlib
+## Implementation in MLlib
 
 Gradient descent methods including stochastic subgradient descent (SGD) as
 included as a low-level primitive in `MLlib`, upon which various ML algorithms 
 are developed, see the 
-<a href="mllib-classification-regression.html">classification and regression</a> 
+<a href="mllib-linear-methods.html">linear methods</a> 
 section for example.
 
 The SGD method
@@ -161,6 +162,4 @@ each iteration, to compute the gradient direction.
 
 Available algorithms for gradient descent:
 
-* [GradientDescent.runMiniBatchSGD](api/scala/index.html#org.apache.spark.mllib.optimization.GradientDescent)
-
-
+* [GradientDescent.runMiniBatchSGD](api/mllib/index.html#org.apache.spark.mllib.optimization.GradientDescent)
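
For readers skimming this diff, the mini-batch SGD idea that the patched guide describes can be sketched in plain Python. This is an illustration only, not MLlib's implementation and not a call into its API: the function name `minibatch_sgd`, its parameters, and the toy data are hypothetical, and both the least-squares loss and the `step_size / sqrt(t)` schedule are assumptions chosen for the sketch.

```python
import math
import random


def minibatch_sgd(data, step_size=0.3, num_iterations=1000,
                  minibatch_fraction=0.5, seed=42):
    """Mini-batch SGD for least squares on (features, label) pairs.

    Hypothetical helper for illustration; not part of MLlib.
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    weights = [0.0] * dim
    for t in range(1, num_iterations + 1):
        # Sample each example independently with probability minibatch_fraction.
        batch = [ex for ex in data if rng.random() < minibatch_fraction]
        if not batch:
            continue  # an empty sample carries no gradient information
        # Average the per-example gradients of 0.5 * (w . x - y)^2.
        grad = [0.0] * dim
        for x, y in batch:
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        grad = [g / len(batch) for g in grad]
        # Decreasing step size (assumed schedule): step_size / sqrt(t).
        gamma = step_size / math.sqrt(t)
        weights = [w - gamma * g for w, g in zip(weights, grad)]
    return weights


# Noise-free toy data generated from y = 2 * x1 - 1 * x2.
points = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
          (2.0, 1.0), (1.0, 2.0), (0.5, 1.5)]
data = [(list(p), 2.0 * p[0] - 1.0 * p[1]) for p in points]
weights = minibatch_sgd(data)
```

Each iteration touches only the sampled subset, which is the cost saving over a full gradient pass; on this toy data the returned weights should land close to the generating coefficients `[2.0, -1.0]`.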