author | Burak Köse <burakks41@gmail.com> | 2016-05-06 13:58:12 -0700 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2016-05-06 13:58:12 -0700 |
commit | e20cd9f4ce977739ce80a2c39f8ebae5e53f72f6 (patch) | |
tree | ea5578c886cae4b083ca2ad6bdd9ca2008fa2bf9 /mllib/src/main/resources/org/apache/spark/ml/feature/stopwords/swedish.txt | |
parent | 5c8fad7b9bfd6677111a8e27e2574f82b04ec479 (diff) | |
[SPARK-14050][ML] Add multiple languages support and additional methods for Stop Words Remover
## What changes were proposed in this pull request?
This PR continues the work from #11871 with the following changes:
* load English stopwords as default
* convert stopwords to list in Python
* update some tests and doc
## How was this patch tested?
Unit tests.
Closes #11871
cc: burakkose srowen
Author: Burak Köse <burakks41@gmail.com>
Author: Xiangrui Meng <meng@databricks.com>
Author: Burak KOSE <burakks41@gmail.com>
Closes #12843 from mengxr/SPARK-14050.
Diffstat (limited to 'mllib/src/main/resources/org/apache/spark/ml/feature/stopwords/swedish.txt')
-rw-r--r-- | mllib/src/main/resources/org/apache/spark/ml/feature/stopwords/swedish.txt | 114 |
1 files changed, 114 insertions, 0 deletions
diff --git a/mllib/src/main/resources/org/apache/spark/ml/feature/stopwords/swedish.txt b/mllib/src/main/resources/org/apache/spark/ml/feature/stopwords/swedish.txt
new file mode 100644
index 0000000000..9fae31c185
--- /dev/null
+++ b/mllib/src/main/resources/org/apache/spark/ml/feature/stopwords/swedish.txt
@@ -0,0 +1,114 @@
+och
+det
+att
+i
+en
+jag
+hon
+som
+han
+på
+den
+med
+var
+sig
+för
+så
+till
+är
+men
+ett
+om
+hade
+de
+av
+icke
+mig
+du
+henne
+då
+sin
+nu
+har
+inte
+hans
+honom
+skulle
+hennes
+där
+min
+man
+ej
+vid
+kunde
+något
+från
+ut
+när
+efter
+upp
+vi
+dem
+vara
+vad
+över
+än
+dig
+kan
+sina
+här
+ha
+mot
+alla
+under
+någon
+eller
+allt
+mycket
+sedan
+ju
+denna
+själv
+detta
+åt
+utan
+varit
+hur
+ingen
+mitt
+ni
+bli
+blev
+oss
+din
+dessa
+några
+deras
+blir
+mina
+samma
+vilken
+er
+sådan
+vår
+blivit
+dess
+inom
+mellan
+sådant
+varför
+varje
+vilka
+ditt
+vem
+vilket
+sitta
+sådana
+vart
+dina
+vars
+vårt
+våra
+ert
+era
+vilkas
\ No newline at end of file
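To illustrate what a list like `swedish.txt` is for, here is a minimal sketch of stop word filtering in plain Python. It is not Spark's `StopWordsRemover` implementation. The helper `remove_stop_words`, the constant `SWEDISH_STOPWORDS_SAMPLE`, and the `case_sensitive` parameter are hypothetical names chosen for this example, and the sample holds only a handful of the 114 words added in the diff above.

```python
# Illustrative sketch only: hypothetical names, not Spark's StopWordsRemover.

# A small sample of entries taken from the swedish.txt list in the diff.
SWEDISH_STOPWORDS_SAMPLE = [
    "och", "det", "att", "i", "en", "jag", "som", "på", "är", "inte",
]


def remove_stop_words(tokens, stop_words, case_sensitive=False):
    """Return the tokens that are not stop words, preserving their order."""
    if case_sensitive:
        blocked = set(stop_words)
        return [t for t in tokens if t not in blocked]
    # Case-insensitive matching: compare lowercased forms on both sides.
    blocked = {w.lower() for w in stop_words}
    return [t for t in tokens if t.lower() not in blocked]


tokens = "Jag tycker att det är en bra bok".split()
print(remove_stop_words(tokens, SWEDISH_STOPWORDS_SAMPLE))
# → ['tycker', 'bra', 'bok']
```

With `case_sensitive=True`, the capitalized token `Jag` would be kept, because only the lowercase form `jag` appears in the list.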