author | Xiangrui Meng <meng@databricks.com> | 2015-07-30 09:45:17 -0700 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2015-07-30 09:45:17 -0700 |
commit | 81464f2a8243c6ae2a39bac7ebdc50d4f60af451 (patch) | |
tree | 633755569bd94d0b2d39b2953872eb3bf19362cc /python | |
parent | ed3cb1d21c73645c8f6e6ee08181f876fc192e41 (diff) | |
[MINOR] [MLLIB] fix doc for RegexTokenizer
This is #7791 for Python. hhbyyh
Author: Xiangrui Meng <meng@databricks.com>
Closes #7798 from mengxr/regex-tok-py and squashes the following commits:
baa2dcd [Xiangrui Meng] fix doc for RegexTokenizer
Diffstat (limited to 'python')
-rw-r--r-- | python/pyspark/ml/feature.py | 2 |
1 file changed, 1 insertion, 1 deletion
diff --git a/python/pyspark/ml/feature.py b/python/pyspark/ml/feature.py
index 86e654dd07..015e7a9d49 100644
--- a/python/pyspark/ml/feature.py
+++ b/python/pyspark/ml/feature.py
@@ -525,7 +525,7 @@ class RegexTokenizer(JavaTransformer, HasInputCol, HasOutputCol):
     """
     A regex based tokenizer that extracts tokens either by using the
     provided regex pattern (in Java dialect) to split the text
-    (default) or repeatedly matching the regex (if gaps is true).
+    (default) or repeatedly matching the regex (if gaps is false).
     Optional parameters also allow filtering tokens using a minimal length.
     It returns an array of strings that can be empty.
 
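The corrected docstring distinguishes two modes: with gaps true (the default) the pattern describes the separators and the text is split on it; with gaps false the pattern describes the tokens and is matched repeatedly. A minimal plain-Python sketch of that distinction follows. It is an illustration only, not the PySpark implementation: `regex_tokenize` and its parameter names are hypothetical, and it uses Python's `re` module rather than the Java regex dialect that RegexTokenizer expects.

```python
import re


def regex_tokenize(text, pattern, gaps=True, min_token_length=1):
    """Hypothetical helper mirroring the documented `gaps` semantics."""
    if gaps:
        # gaps=True: the pattern matches the separators, so split on it.
        tokens = re.split(pattern, text)
    else:
        # gaps=False: the pattern matches the tokens, so collect every match.
        tokens = [m.group(0) for m in re.finditer(pattern, text)]
    # Filter by minimal length; the resulting list may be empty.
    return [t for t in tokens if len(t) >= min_token_length]


# Same tokens, reached two ways: splitting on whitespace vs. matching words.
split_tokens = regex_tokenize("a b  c", r"\s+", gaps=True)
matched_tokens = regex_tokenize("a b  c", r"\w+", gaps=False)
```

Both calls are meant to yield the same three tokens; only the role of the pattern differs, which is the point the one-word docstring fix makes.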