Stopwords Corpus
This corpus contains lists of stop words for several languages. These
are high-frequency grammatical words which are usually ignored in text
retrieval applications.
They were obtained from:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/snowball/stopwords/
The English list has been augmented; see:
https://github.com/nltk/nltk_data/issues/22
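
The following is a minimal sketch of how a stop word list might be used to
filter tokens before indexing or retrieval. It assumes each list is a plain
UTF-8 text file with one word per line; that layout, the file name
"english" used in the demo, and the helper names load_stopwords and
remove_stopwords are illustrative assumptions, not part of this corpus's
documentation.

```python
from pathlib import Path
import tempfile


def load_stopwords(path):
    """Read a stop word list into a set of lowercase words.

    Assumes one word per line (an assumption about the file layout).
    Blank lines are skipped.
    """
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def remove_stopwords(tokens, stopwords):
    """Return the tokens that are not stop words (case-insensitive match)."""
    return [tok for tok in tokens if tok.lower() not in stopwords]


# Demo with a tiny stand-in list written to a temporary file;
# a real list from the corpus would be read the same way.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "english"  # hypothetical file name
    sample.write_text("the\nis\non\n", encoding="utf-8")
    stops = load_stopwords(sample)
    print(remove_stopwords("The cat is on the mat".split(), stops))
    # prints ['cat', 'mat']
```

If NLTK is installed and this corpus has been downloaded, the same lists
are also available through nltk.corpus.stopwords.words("english").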