[SPARK-9977] [DOCS] Update documentation for StringIndexer

`StringIndexer` writes the indexed labels to a new column. Any estimator that follows it in a pipeline must therefore read from this new column if it wants to use the string-indexed labels. I think it is better to make this explicit in the documentation.

Author: lewuathe <lewuathe@me.com>

Closes #8205 from Lewuathe/SPARK-9977.
author: lewuathe <lewuathe@me.com> 2015-08-19 09:54:03 +0100
committer: Sean Owen <sowen@cloudera.com> 2015-08-19 09:54:03 +0100
commit: ba2a07e2b6c5a39597b64041cd5bf342ef9631f5 (patch)
tree: 6005346fcddd42bfebc5083ec55562427e8ab1b8 /docs/ml-features.md
parent: 865a3df3d578c0442c97d749c81f554b560da406 (diff)
download: spark-ba2a07e2b6c5a39597b64041cd5bf342ef9631f5.tar.gz
spark-ba2a07e2b6c5a39597b64041cd5bf342ef9631f5.tar.bz2
spark-ba2a07e2b6c5a39597b64041cd5bf342ef9631f5.zip
1 files changed, 5 insertions, 1 deletions
diff --git a/docs/ml-features.md b/docs/ml-features.md
index d82c85ee75..8d56dc32ca 100644
--- a/docs/ml-features.md
+++ b/docs/ml-features.md
@@ -725,7 +725,11 @@ dctDf.select("featuresDCT").show(3);
 `StringIndexer` encodes a string column of labels to a column of label indices.
 The indices are in `[0, numLabels)`, ordered by label frequencies.
 So the most frequent label gets index `0`.
-If the input column is numeric, we cast it to string and index the string values.
+If the input column is numeric, we cast it to string and index the string 
+values. When downstream pipeline components such as `Estimator` or 
+`Transformer` make use of this string-indexed label, you must set the input 
+column of the component to this string-indexed column name. In many cases, 
+you can set the input column with `setInputCol`.
 
 **Examples**
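
The behaviour described in the patched paragraph can be illustrated with a small plain-Python sketch. It does not use Spark and is not Spark's implementation: the helper names `fit_string_indexer` and `transform`, the column names, and the alphabetical tie-break are all assumptions made for illustration. It shows the frequency-ordered indices in `[0, numLabels)`, the cast of numeric input to string, and why a downstream stage has to be pointed at the new indexed column rather than the original one.

```python
from collections import Counter


def fit_string_indexer(values):
    """Map each distinct label (cast to str) to an index in [0, numLabels)."""
    # Numeric input is cast to string before indexing.
    labels = [str(v) for v in values]
    counts = Counter(labels)
    # Most frequent label first. Breaking ties alphabetically is an
    # assumption of this sketch, not a statement about Spark.
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: float(i) for i, label in enumerate(ordered)}


def transform(rows, mapping, input_col, output_col):
    """Return new rows with the indexed label written to a new column."""
    return [{**row, output_col: mapping[str(row[input_col])]} for row in rows]


# Hypothetical input data: "a" occurs 3 times, "c" twice, "b" once.
rows = [{"id": i, "category": c}
        for i, c in enumerate(["a", "b", "c", "a", "a", "c"])]

mapping = fit_string_indexer(r["category"] for r in rows)
indexed = transform(rows, mapping, "category", "categoryIndex")

# A later stage must read the indexed column by name; the original
# "category" column still holds the raw strings.
downstream_input_col = "categoryIndex"
label_values = [row[downstream_input_col] for row in indexed]
```

In a real pipeline the equivalent step is configuring the downstream component's input column (in many cases with `setInputCol`, as the patch notes) to the name given as the indexer's output column.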