[SPARK-10595] [ML] [MLLIB] [DOCS] Various ML guide cleanups

Various ML guide cleanups.

* ml-guide.md: Make it easier to access the algorithm-specific guides.
* LDA user guide: EM often begins with useless topics, but running longer generally improves them dramatically. E.g., 10 iterations on a Wikipedia dataset produces useless topics, but 50 iterations produces very meaningful topics.
* mllib-feature-extraction.html#elementwiseproduct: “w” parameter should be “scalingVec”
* Clean up Binarizer user guide a little.
* Document in Pipeline that users should not put an instance into the Pipeline in more than 1 place.
* spark.ml Word2Vec user guide: clean up grammar/writing
* Chi Sq Feature Selector docs: Improve text in doc.

CC: mengxr feynmanliang

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #8752 from jkbradley/mlguide-fixes-1.5.
author: Joseph K. Bradley <joseph@databricks.com> 2015-09-15 19:43:26 -0700
committer: Xiangrui Meng <meng@databricks.com> 2015-09-15 19:43:26 -0700
commit: b921fe4dc0442aa133ab7d55fba24bc798d59aa2 (patch)
tree: 5a545ee45ab39f6caad096049818564914635334 /docs/mllib-guide.md
parent: 64c29afcb787d9f176a197c25314295108ba0471 (diff)
download: spark-b921fe4dc0442aa133ab7d55fba24bc798d59aa2.tar.gz spark-b921fe4dc0442aa133ab7d55fba24bc798d59aa2.tar.bz2 spark-b921fe4dc0442aa133ab7d55fba24bc798d59aa2.zip
1 files changed, 2 insertions, 2 deletions
diff --git a/docs/mllib-guide.md b/docs/mllib-guide.md
index 257f7cc760..91e50ccfec 100644
--- a/docs/mllib-guide.md
+++ b/docs/mllib-guide.md
@@ -13,9 +13,9 @@ primitives and higher-level pipeline APIs.
 
 It divides into two packages:
 
-* [`spark.mllib`](mllib-guide.html#mllib-types-algorithms-and-utilities) contains the original API
+* [`spark.mllib`](mllib-guide.html#data-types-algorithms-and-utilities) contains the original API
   built on top of [RDDs](programming-guide.html#resilient-distributed-datasets-rdds).
-* [`spark.ml`](mllib-guide.html#sparkml-high-level-apis-for-ml-pipelines) provides higher-level API
+* [`spark.ml`](ml-guide.html) provides higher-level API
   built on top of [DataFrames](sql-programming-guide.html#dataframes) for constructing ML pipelines.
 
 Using `spark.ml` is recommended because with DataFrames the API is more versatile and flexible.