[SPARK-18793][SPARK-18794][R] add spark.randomForest/spark.gbt to vignettes

## What changes were proposed in this pull request?

Mention `spark.randomForest` and `spark.gbt` in vignettes. Keep the content minimal since users can type `?spark.randomForest` to see the full doc.

cc: jkbradley

Author: Xiangrui Meng <meng@databricks.com>

Closes #16264 from mengxr/SPARK-18793.
author: Xiangrui Meng <meng@databricks.com> 2016-12-13 16:59:09 -0800
committer: Xiangrui Meng <meng@databricks.com> 2016-12-13 16:59:09 -0800
commit: 594b14f1ebd0b3db9f630e504be92228f11b4d9f (patch)
tree: 90217129249738bb03b3d824b4da2816f1c0b544 /R
parent: c68fb426d4ac05414fb402aa1f30f4c98df103ad (diff)
download: spark-594b14f1ebd0b3db9f630e504be92228f11b4d9f.tar.gz
spark-594b14f1ebd0b3db9f630e504be92228f11b4d9f.tar.bz2
spark-594b14f1ebd0b3db9f630e504be92228f11b4d9f.zip
1 files changed, 32 insertions, 0 deletions
diff --git a/R/pkg/vignettes/sparkr-vignettes.Rmd b/R/pkg/vignettes/sparkr-vignettes.Rmd
index 625b759626..334daa51f0 100644
--- a/R/pkg/vignettes/sparkr-vignettes.Rmd
+++ b/R/pkg/vignettes/sparkr-vignettes.Rmd
@@ -449,6 +449,10 @@ SparkR supports the following machine learning models and algorithms.
 
 * Generalized Linear Model (GLM)
 
+* Random Forest
+
+* Gradient-Boosted Trees (GBT)
+
 * Naive Bayes Model
 
 * $k$-means Clustering
@@ -526,6 +530,34 @@ gaussianFitted <- predict(gaussianGLM, carsDF)
 head(select(gaussianFitted, "model", "prediction", "mpg", "wt", "hp"))
 ```
 
+#### Random Forest
+
+`spark.randomForest` fits a [random forest](https://en.wikipedia.org/wiki/Random_forest) classification or regression model on a `SparkDataFrame`.
+Users can call `summary` to get a summary of the fitted model, `predict` to make predictions, and `write.ml`/`read.ml` to save/load fitted models.
+
+In the following example, we use the `longley` dataset to train a random forest and make predictions:
+
+```{r, warning=FALSE}
+df <- createDataFrame(longley)
+rfModel <- spark.randomForest(df, Employed ~ ., type = "regression", maxDepth = 2, numTrees = 2)
+summary(rfModel)
+predictions <- predict(rfModel, df)
+```
+
+#### Gradient-Boosted Trees
+
+`spark.gbt` fits a [gradient-boosted tree](https://en.wikipedia.org/wiki/Gradient_boosting) classification or regression model on a `SparkDataFrame`.
+Users can call `summary` to get a summary of the fitted model, `predict` to make predictions, and `write.ml`/`read.ml` to save/load fitted models.
+
+Similar to the random forest example above, we use the `longley` dataset to train a gradient-boosted tree and make predictions:
+
+```{r, warning=FALSE}
+df <- createDataFrame(longley)
+gbtModel <- spark.gbt(df, Employed ~ ., type = "regression", maxDepth = 2, maxIter = 2)
+summary(gbtModel)
+predictions <- predict(gbtModel, df)
+```
+
 #### Naive Bayes Model
 
 Naive Bayes model assumes independence among the features. `spark.naiveBayes` fits a [Bernoulli naive Bayes model](https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Bernoulli_naive_Bayes) against a SparkDataFrame. The data should be all categorical. These models are often used for document classification.
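
Note on the patch: the added vignette text mentions `write.ml`/`read.ml` for saving and loading fitted models, but the added chunks only show `summary` and `predict`. A minimal sketch of that round trip, which is not part of this commit, might look like the following. It assumes a local Spark installation and the `SparkR` package on the library path; `modelPath`, `reloaded`, and `reloadedPredictions` are hypothetical names introduced for illustration.

```r
library(SparkR)

# Assumption: Spark is installed locally; start (or reuse) a SparkR session.
sparkR.session(master = "local[1]")

# Same data and model settings as the random forest chunk in the patch.
# longley has dotted column names, which SparkR may rename with a warning.
df <- suppressWarnings(createDataFrame(longley))
rfModel <- spark.randomForest(df, Employed ~ ., type = "regression",
                              maxDepth = 2, numTrees = 2)

# Hypothetical location for the saved model (a fresh temporary path).
modelPath <- tempfile(pattern = "rfModel", fileext = ".tmp")
write.ml(rfModel, modelPath)

# Load the model back and predict with the reloaded copy.
reloaded <- read.ml(modelPath)
reloadedPredictions <- predict(reloaded, df)
head(select(reloadedPredictions, "Employed", "prediction"))

# Clean up the saved model directory and stop the session.
unlink(modelPath, recursive = TRUE)
sparkR.session.stop()
```

The same pattern should apply to a model fitted with `spark.gbt`, since the vignette text describes `write.ml`/`read.ml` for both model types.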