path: root/docs/mllib-decision-tree.md
author: Timothy Hunter <timhunter@databricks.com> 2015-12-10 12:50:46 -0800
committer: Joseph K. Bradley <joseph@databricks.com> 2015-12-10 12:50:46 -0800
commit: 2ecbe02d5b28ee562d10c1735244b90a08532c9e
tree: c589a01a2900513aa1b277303ed7cdffc1961ba4 /docs/mllib-decision-tree.md
parent: ec5f9ed5de2218938dba52152475daafd4dc4786
[SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation.
Replaces a number of occurrences of `MLlib` in the documentation that were meant to refer to the `spark.mllib` package instead. This should clarify for new users the difference between `spark.mllib` (the package) and MLlib (the umbrella project for ML in Spark). It also removes some files that I forgot to delete with #10207.

Author: Timothy Hunter <timhunter@databricks.com>

Closes #10234 from thunterdb/12212.
Diffstat (limited to 'docs/mllib-decision-tree.md')
-rw-r--r-- docs/mllib-decision-tree.md | 6
1 file changed, 3 insertions, 3 deletions
diff --git a/docs/mllib-decision-tree.md b/docs/mllib-decision-tree.md
index 77ce34e91a..a8612b6c84 100644
--- a/docs/mllib-decision-tree.md
+++ b/docs/mllib-decision-tree.md
@@ -1,7 +1,7 @@
---
layout: global
-title: Decision Trees - MLlib
-displayTitle: <a href="mllib-guide.html">MLlib</a> - Decision Trees
+title: Decision Trees - spark.mllib
+displayTitle: Decision Trees - spark.mllib
---
* Table of contents
@@ -15,7 +15,7 @@ feature scaling, and are able to capture non-linearities and feature interaction
algorithms such as random forests and boosting are among the top performers for classification and
regression tasks.
-MLlib supports decision trees for binary and multiclass classification and for regression,
+`spark.mllib` supports decision trees for binary and multiclass classification and for regression,
using both continuous and categorical features. The implementation partitions data by rows,
allowing distributed training with millions of instances.