Universal POS Tag scheme discrepancies #4485
svlandeg added the enhancement (Feature requests and improvements) and feat / tagger (Feature: Part-of-speech tagger) labels on Oct 21, 2019
adrianeboyd added the bug (Bugs and behaviour differing from documentation) label and removed the enhancement (Feature requests and improvements) label on Oct 22, 2019
Thanks for bringing this up again! The tag maps should all target UD v2 at this point, so this is a mistake. I'll verify the conversion tables against the UD conversion info and update the documentation.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Although I raised this issue a long time ago (#593) and I know some work has been done on it, there is still a discrepancy in how spaCy maps fine-grained POS tags to coarse-grained POS tags for English.
Specifically, the current spaCy tag_map maps PRP$ and WP$ to PRON, while Universal Dependencies maps them to DET (UD tag map). These tags cover words like "my" in "my car" and "whose" in "whose car", which UD lists as examples of determiners (link).
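To make the mismatch concrete, here is a minimal sketch that compares the two mappings. The dictionaries are illustrative excerpts containing only the two tags reported in this issue; they are not the actual contents of spaCy's tag_map or of the UD conversion table, and `find_discrepancies` is a hypothetical helper name.

```python
# Illustrative excerpts only: these entries reflect what this issue reports,
# not the full contents of spaCy's tag_map or the UD conversion table.
SPACY_TAG_MAP_EXCERPT = {"PRP$": "PRON", "WP$": "PRON"}
UD_TAG_MAP_EXCERPT = {"PRP$": "DET", "WP$": "DET"}


def find_discrepancies(actual, expected):
    """Return {fine_tag: (actual_pos, expected_pos)} for tags whose coarse POS differs.

    Only tags present in both mappings are compared.
    """
    return {
        tag: (actual[tag], expected[tag])
        for tag in sorted(actual.keys() & expected.keys())
        if actual[tag] != expected[tag]
    }


if __name__ == "__main__":
    diffs = find_discrepancies(SPACY_TAG_MAP_EXCERPT, UD_TAG_MAP_EXCERPT)
    for tag, (got, want) in diffs.items():
        print(f"{tag}: spaCy maps to {got}, UD maps to {want}")
```

The same comparison could presumably be run over the complete tag maps to check whether any other fine-grained tags diverge.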
I also noticed some old information on the annotation specifications page:
https://spacy.io/api/annotation#pos-tagging
Specifically, the header in the Universal Part-of-speech Tags tab says you use the Universal Dependencies scheme, while the headers in the English and German tabs say you use the Google Universal Tagset. The latter is no longer true for English (although maybe it still is for German?).