stop words missing for en_core_web_md #922
Comments
This sounds like a bug in the model, thanks. The general-purpose answer is that flags like
This should be a good workaround for you until the model is updated.
Btw could you run:
And paste the results here? Thanks,
Info about spaCy
and Info about model en_core_web_md
Same here. The correct workaround is: nlp.vocab.add_flag(lambda s: s in spacy.en.word_sets.STOP_WORDS, spacy.attrs.IS_STOP) (the function comes first, the flag ID second).
To include lower-, upper-, and title-cased words (him/HIM/Him), I had to use: nlp.vocab.add_flag(lambda s: s.lower() in spacy.en.word_sets.STOP_WORDS, spacy.attrs.IS_STOP)
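For anyone comparing the two lambdas above, here is a minimal sketch of how they differ. It does not import spaCy: the small STOP_WORDS set is a hypothetical stand-in for spacy.en.word_sets.STOP_WORDS, and the function names are made up for illustration.

```python
# Hypothetical stand-in for spacy.en.word_sets.STOP_WORDS (the real set is much larger).
STOP_WORDS = {"him", "the", "and", "of"}


def is_stop_exact(s):
    """Mirrors the first workaround: matches only the exact (lowercase) form."""
    return s in STOP_WORDS


def is_stop_any_case(s):
    """Mirrors the second workaround: lowercases the string before the lookup."""
    return s.lower() in STOP_WORDS


# With spaCy 1.x and a model loaded as `nlp`, the thread's workaround would
# register the predicate like this (not executed here):
#   nlp.vocab.add_flag(is_stop_any_case, spacy.attrs.IS_STOP)

if __name__ == "__main__":
    for word in ("him", "HIM", "Him", "spaCy"):
        print(word, is_stop_exact(word), is_stop_any_case(word))
```

The exact-match version misses HIM and Him because the stand-in set holds only lowercase entries, which is the behaviour the second comment works around.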
The new
@ines, I'm using
Also having this problem with
Same problem but with
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
I'm new to spaCy and want to configure stop words.
The regular stop words in spacy.en.STOP_WORDS do not seem to apply when loading the larger en_core_web_md model.
How can I configure the larger model to use the regular stop words?