How to integrate util.filter_spans in nlp.pipe() ? - ValueError: [E102] Can't merge non-disjoint spans. #5393
Comments
Hmm, I actually suspect something is going wrong with
Also, the output of
Hi Adriane, thanks for your response. I filtered out a few sentences and errors like this:
It found 12 errors in 73327 texts. Here are the texts that caused the errors:
Here are the respective errors:
Regarding
I suppose it's taking the model from outside the conda env, so I ran
Hope this helps :)
Thanks for the examples! I can replicate this with v2.2.3 and v2.2.4, but not with
An additional text from #5458:
thanks for fixing this :) @honnibal @svlandeg @adrianeboyd
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Hi, I've added nlp.create_pipe("merge_noun_chunks") to my nlp pipeline as described here: https://spacy.io/api/pipeline-functions.
When I run the nlp pipeline on large amounts of text, I sometimes get the following error. (For some corpora I get the error, for others I don't, probably depending on which sentences they happen to contain.)
I saw in other issues (e.g. #3687) that this can be solved with the util.filter_spans function, but I don't understand how to integrate this helper function into an nlp.pipe pipeline.
Thanks for your advice :)
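For context on why filtering helps: merging requires the spans to be disjoint, and E102 is raised when two of them share a token. spaCy's `spacy.util.filter_spans` takes a list of spans and returns a non-overlapping subset, preferring longer spans. The sketch below illustrates that idea in plain Python on `(start, end)` token offsets. It is not spaCy's own code, and the name `filter_overlapping` is hypothetical.

```python
def filter_overlapping(spans):
    """Return a disjoint subset of (start, end) token spans (end exclusive).

    Longer spans are preferred; for spans of equal length, the one that
    starts earlier is preferred. This is an illustrative sketch only.
    """
    # Sort longest first; among equal lengths, smaller start first.
    ordered = sorted(spans, key=lambda s: (s[1] - s[0], -s[0]), reverse=True)
    kept = []
    seen_tokens = set()
    for start, end in ordered:
        # Skip any span that shares at least one token with a kept span.
        if any(i in seen_tokens for i in range(start, end)):
            continue
        kept.append((start, end))
        seen_tokens.update(range(start, end))
    # Return the survivors in document order.
    return sorted(kept)


overlapping = [(0, 3), (1, 2), (2, 5), (6, 8)]
disjoint = filter_overlapping(overlapping)
```

In a spaCy pipeline the equivalent step would presumably live in a custom component placed after the parser, which passes the candidate spans through `spacy.util.filter_spans` before merging them inside `doc.retokenize()`, instead of relying on the built-in `merge_noun_chunks` component.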
How to reproduce the behaviour
(Unfortunately I can't give you a specific string or context_tpl_lst object, because I don't know which sentence in my corpus is causing the error)
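When the offending sentence is unknown, one way to narrow it down is to process the texts one at a time and record those that raise a `ValueError`. The helper below is a minimal sketch under that assumption. `find_failing_texts` and its `process` parameter are hypothetical names, and in this thread `process` would be the loaded `nlp` object.

```python
def find_failing_texts(texts, process):
    """Run `process` on each text and collect the ones that raise ValueError.

    `process` is any callable taking a single string (assumed here to be
    the spaCy `nlp` object). Returns a list of (index, text, message).
    """
    failures = []
    for index, text in enumerate(texts):
        try:
            process(text)
        except ValueError as err:
            # Keep the index so the text can be found again in the corpus.
            failures.append((index, text, str(err)))
    return failures
```

This is slower than `nlp.pipe` because it processes each text individually, so it is best suited to a one-off diagnostic run.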
Your Environment