PhraseMatcher.remove throws an error if there are duplicates in the list of phrases #4435
Comments
Thanks for the report and the code example! Can confirm the erratic behaviour - this certainly looks like a bug. We'll look into it.
My pleasure. I'm glad you are looking into it. I'm trying to work around it, but end up manually making a second list of the purged phrases. This will purge the duplicates...
I'm not sure there's a good workaround even if you filter the phrases a bit, since there were a couple of things wrong with how it removed multiple phrases for the same match ID. The only (still rather hacky) workaround that I think could work would be to add each phrase with a separate match ID (like "ANIMAL_sheep" or "ANIMAL_dog") and filter/reduce the match IDs after matching. Thanks again for the bug report! Hopefully this patch will fix the problem for the next minor release.
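A rough sketch of the "separate match ID per phrase" idea, for anyone who wants to try it. This is not code from the patch; the helper names `per_phrase_ids` and `reduce_matches` are made up for illustration, and it assumes labels such as "ANIMAL" contain no underscore. The spaCy calls themselves are left out because the `PhraseMatcher.add` signature differs between spaCy versions; the helpers only build the IDs and collapse the matches afterwards.

```python
def per_phrase_ids(label, phrases):
    """Build one match ID per distinct phrase, e.g. "ANIMAL_sheep".

    Hypothetical helper: duplicate phrases collapse onto the same ID,
    so each ID ends up being added to the matcher only once.
    """
    ids = {}
    for phrase in phrases:
        key = f"{label}_{phrase.lower().replace(' ', '_')}"
        ids.setdefault(key, phrase)
    return ids


def reduce_matches(matches, sep="_"):
    """Map per-phrase IDs back to their label and drop duplicate spans.

    `matches` is an iterable of (match_id_string, start, end). With spaCy,
    the string form of a match hash can be looked up via
    nlp.vocab.strings[match_id]. Assumes the label contains no `sep`.
    """
    seen = set()
    reduced = []
    for match_id, start, end in matches:
        label = match_id.split(sep, 1)[0]
        item = (label, start, end)
        if item not in seen:
            seen.add(item)
            reduced.append(item)
    return reduced


# Duplicates in the phrase list yield a single ID each.
ids = per_phrase_ids("ANIMAL", ["sheep", "dog", "sheep", "sheep dog"])

# Matches as they might come back after converting hashes to strings.
raw = [
    ("ANIMAL_sheep", 0, 1),
    ("ANIMAL_sheep_dog", 0, 2),
    ("ANIMAL_dog", 1, 2),
    ("ANIMAL_sheep", 0, 1),
]
reduced = reduce_matches(raw)
```

With this layout, dropping a single phrase would mean removing just its own ID (for example "ANIMAL_sheep") from the matcher, which should sidestep the case of several phrases sharing one match ID.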
How to reproduce the behaviour
Removing phrases from the PhraseMatcher causes an error when one of the phrases exists as part of another phrase. Sometimes, with long lists of phrases, it causes a segmentation fault. This snippet might reproduce the error (it does for me):
will throw this error:
Your Environment