Help with lemmatization, different results #3644
Comments
Can you check the POS tags for the sentences in your input? Are the sentences correctly tagged?
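One way to run the check suggested above is to print each token's tags next to its lemma. This is a minimal sketch, not something posted in the thread: `tag_report` is a hypothetical helper name, and it only assumes the standard spaCy token attributes `text`, `pos_`, `tag_` and `lemma_`. In spaCy's English pipelines the lemma generally depends on the predicted part of speech, so "leaves" tagged as a noun would be expected to give 'leaf' and tagged as a verb 'leave'.

```python
def tag_report(doc):
    """Return (text, coarse POS, fine-grained tag, lemma) for each token.

    `doc` is expected to be a spaCy Doc, but any iterable of objects
    exposing these four attributes works.
    """
    return [(t.text, t.pos_, t.tag_, t.lemma_) for t in doc]


# Usage, assuming spaCy and the en_core_web_sm model are installed:
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   for row in tag_report(nlp("processing of tea leaves")):
#       print(row)
```

If the tag printed for "leaves" changes between runs, the varying lemma follows from the tagger's prediction rather than from the lemmatizer itself.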
Hi @DuyguA, thank you very much for your answer. It seems that the sentence is correctly tagged. I'm fine with either of those two results; I just want to consistently get the same one, either 'leaf' or 'leave'. (Not sure if I made myself clear.)
Looks like it's working for me. Thanks a lot!
If this completely solved your issue, please close this topic so that we can focus our attention on open issues.
I'm currently using spaCy with Python. The model used is en_core_web_sm (2.1.0).
The following code is run to retrieve a list of "cleansed" words from a query:
import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp(query)
list_words = []
for token in doc:
    if token.text != ' ':
        list_words.append(token.lemma_)
However, I face a major issue when running this code. For example, when the query is "processing of tea leaves", the result stored in list_words can be either ['processing', 'tea', 'leaf'] or ['processing', 'tea', 'leave'].
It seems that the result is not consistent. I cannot change my input/query (adding another word for context is not possible) and I really need to find the same result every time. I think the loading of the model may be the issue.
Why do the results differ? Can I load the model the "same" way every time? Did I miss a parameter that would give the same result for an ambiguous query?
Thanks for your help
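A possible workaround, if only a few ambiguous words matter, is to pin their lemmas in application code after spaCy has run. This is a sketch under stated assumptions and not the fix that was confirmed in this thread: `LEMMA_OVERRIDES` and `stable_lemmas` are hypothetical names, and the only spaCy attributes relied on are `token.text`, `token.lemma_` and `token.is_space`. It does not explain why the model's output varies; it only makes the final list deterministic for the listed words.

```python
# Hypothetical override table: lowercased surface form -> lemma to always use.
LEMMA_OVERRIDES = {"leaves": "leaf"}


def stable_lemmas(doc, overrides=LEMMA_OVERRIDES):
    """Return lemmas for the non-whitespace tokens of `doc`.

    A token whose lowercased text is in `overrides` gets the pinned lemma;
    every other token keeps whatever lemma the pipeline assigned.
    """
    lemmas = []
    for token in doc:
        if token.is_space:  # skip whitespace-only tokens
            continue
        lemmas.append(overrides.get(token.text.lower(), token.lemma_))
    return lemmas


# Usage, assuming spaCy and the en_core_web_sm model are installed:
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   print(stable_lemmas(nlp("processing of tea leaves")))
```

Note that `token.is_space` is a slightly broader check than comparing the text to a single space, and that pinning "leaves" to 'leaf' will be wrong for sentences where it really is a verb.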