Improvements on setting attributes in merge_noun_chunks function #4107
Comments
I think those both sound like reasonable defaults, so we should probably consider just adding them to the built-in function 🙂 Feel free to submit a PR, btw. We probably want to avoid introducing settings to the built-in component functions: it adds too much complexity for what they are (small wrappers around …).
Btw, also in case others come across this issue later, see spaCy/spacy/pipeline/functions.py, lines 7 to 21 in 250a544.
Resolved by #4219!
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Feature description
It would be nice if during the merging of noun chunks the lemma would also be properly set. Currently, the lemma of the merged token is the lemma of the first word in the noun chunk sequence, which I don't think is the desired behavior in most cases.
Additionally, it would be nice to have more flexibility in setting the attributes of the merged token. For example, I would like to keep the entity type of the root token as the entity type of the whole merged token.
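For anyone who wants this behavior without waiting on the built-in function, here is a rough sketch of a custom component. It is not the library's implementation: the names `chunk_attrs` and `merge_noun_chunks_custom` are made up for illustration, and joining the token lemmas with single spaces is just one assumed way to build a lemma for the chunk. It relies only on `doc.retokenize()`, `doc.noun_chunks`, and the `attrs` argument of `retokenizer.merge`, and it assumes the `Doc` has already been parsed.

```python
def chunk_attrs(chunk):
    """Build the attrs for one merged noun chunk.

    `chunk` is expected to behave like a spaCy Span: iterable over tokens
    and exposing `.root`. (Hypothetical helper, not part of spaCy.)
    """
    root = chunk.root
    attrs = {
        # Tag and dependency label are taken from the chunk's root token.
        "TAG": root.tag_,
        "DEP": root.dep_,
        # Assumption: the chunk lemma is the token lemmas joined by spaces,
        # instead of only the lemma of the first token.
        "LEMMA": " ".join(token.lemma_ for token in chunk),
    }
    # Keep the root token's entity type for the whole merged token, if any.
    if root.ent_type_:
        attrs["ENT_TYPE"] = root.ent_type_
    return attrs


def merge_noun_chunks_custom(doc):
    """Merge each noun chunk into one token (assumes a parsed Doc)."""
    with doc.retokenize() as retokenizer:
        for chunk in doc.noun_chunks:
            retokenizer.merge(chunk, attrs=chunk_attrs(chunk))
    return doc
```

It can be called directly on a parsed document, e.g. `doc = merge_noun_chunks_custom(nlp(text))`, where `nlp` is assumed to be a pipeline that includes a dependency parser. Keeping the attribute logic in its own small function means the defaults can be changed without touching the merging loop.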