parser not obeying is_sent_start == False (regression) #2772
Thanks, I've been chasing this bug for a while on develop. I think it's occurring in |
The issue arises when we have non-projective dependencies (aka crossing brackets). The parser is constrained to produce only projective trees, but there's a pre- and post-processing trick that lets the parser predict non-projective analyses. After deprojectivisation, we run the |
The set_children_from_heads function assumed parse trees were projective. However, non-projective parses may be passed in during deserialization, or after deprojectivising. This caused incorrect sentence boundaries to be set for non-projective parses. Close #2772.
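To make the projectivity assumption concrete, here is a minimal pure-Python sketch. It is not spaCy's implementation: the helper names `crossing_arcs` and `subtree_edges` are hypothetical, and the parse is assumed to be given as a plain list of head indices in which a root token points to itself. The first helper reports pairs of crossing arcs (the sign of a non-projective parse); the second computes the leftmost and rightmost token of every subtree by walking each token up to its root, which does not rely on the tree being projective.

```python
from typing import List, Tuple

Arc = Tuple[int, int]


def crossing_arcs(heads: List[int]) -> List[Tuple[Arc, Arc]]:
    """Return pairs of arcs that cross each other.

    Assumption: heads[i] is the index of token i's head, and a root
    token is its own head. An empty result means no arcs cross.
    """
    # Store each arc as (left end, right end); skip the root self-loops.
    arcs = [(min(i, h), max(i, h)) for i, h in enumerate(heads) if i != h]
    crossings = []
    for a in range(len(arcs)):
        for b in range(a + 1, len(arcs)):
            (l1, r1), (l2, r2) = sorted([arcs[a], arcs[b]])
            # Two arcs cross when exactly one end of the second arc
            # lies strictly inside the first.
            if l1 < l2 < r1 < r2:
                crossings.append(((l1, r1), (l2, r2)))
    return crossings


def subtree_edges(heads: List[int]) -> Tuple[List[int], List[int]]:
    """Compute the leftmost and rightmost token index of each subtree.

    Every token widens the span of all of its ancestors, so the result
    is valid for non-projective trees as well as projective ones.
    """
    n = len(heads)
    left = list(range(n))
    right = list(range(n))
    for i in range(n):
        node = i
        seen = set()  # guards against malformed input containing cycles
        while heads[node] != node and node not in seen:
            seen.add(node)
            node = heads[node]
            left[node] = min(left[node], i)
            right[node] = max(right[node], i)
    return left, right


if __name__ == "__main__":
    # Hypothetical 4-token parse: token 2 is the root and the arcs
    # 0 <- 2 and 1 <- 3 cross each other.
    heads = [2, 3, 2, 2]
    print(crossing_arcs(heads))
    print(subtree_edges(heads))
```

In the example parse, the subtree of token 3 covers tokens 1 and 3 but not token 2, so its edges enclose a token that does not belong to it. Code that treats each subtree as a contiguous span, as is safe for projective trees, can therefore place boundaries in the wrong position on input like this.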
Thanks again for the test case. Fixed now! This had held up the experiments on the universal dependencies corpus, as there are many more non-projective parses there.
Not only is spaCy nightly not obeying is_sent_start, it is also producing a bad sentence segmentation.
How to reproduce the behaviour
I expected
1 When we write or communicate virtually, we can hide our through feelings and many not become ourselves since we do not want the other party to judge us.
1 When we write or communicate virtually, we can hide our through feelings and many not become ourselves since we do not want the other party to judge us.
But get this instead
1 When
2 we write or communicate virtually, we can hide our through feelings and many not become ourselves since we do not want the other party to judge us
1 When
2 we write or communicate virtually, we can hide our through feelings and many not become ourselves since we do not want the other party to judge us
When I run this on the binder (https://spacy.io/usage/processing-pipelines#component-example1) it works as expected.
Your Environment