Diffstat (limited to 'docs/ml-features.md')
-rw-r--r-- docs/ml-features.md 6
1 files changed, 5 insertions, 1 deletions
diff --git a/docs/ml-features.md b/docs/ml-features.md
index d82c85ee75..8d56dc32ca 100644
--- a/docs/ml-features.md
+++ b/docs/ml-features.md
@@ -725,7 +725,11 @@ dctDf.select("featuresDCT").show(3);
`StringIndexer` encodes a string column of labels to a column of label indices.
The indices are in `[0, numLabels)`, ordered by label frequencies.
So the most frequent label gets index `0`.
-If the input column is numeric, we cast it to string and index the string values.
+If the input column is numeric, we cast it to string and index the string
+values. When downstream pipeline components such as `Estimator` or
+`Transformer` make use of this string-indexed label, you must set the input
+column of the component to this string-indexed column name. In many cases,
+you can set the input column with `setInputCol`.
**Examples**
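The ordering rule described above (indices in `[0, numLabels)`, most frequent label first, numeric input cast to string) can be illustrated with a small plain-Python sketch. This is not Spark's implementation and does not use the Spark API: `fit_string_index` is a hypothetical helper, the alphabetical tie-breaking is an assumption made here for determinism, and the sketch returns integer indices for simplicity.

```python
from collections import Counter


def fit_string_index(values):
    """Map each distinct label to an index, most frequent label first.

    Hypothetical helper that mimics the documented ordering rule only.
    """
    # Numeric input is cast to string before indexing, as the text describes.
    labels = [str(v) for v in values]
    counts = Counter(labels)
    # Sort by descending frequency; ties are broken alphabetically
    # (an assumption of this sketch, chosen to make the result deterministic).
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: index for index, label in enumerate(ordered)}


categories = ["a", "b", "c", "a", "a", "c"]
mapping = fit_string_index(categories)
indexed = [mapping[value] for value in categories]

# Numeric labels are indexed by their string form.
numeric_mapping = fit_string_index([10, 20, 10])
```

Here `"a"` appears three times, `"c"` twice, and `"b"` once, so `"a"` maps to index `0`, `"c"` to `1`, and `"b"` to `2`. A downstream component would then read the indexed column rather than the original string column.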