Fix problems with lower and whitespace in variants #5361

Merged


adrianeboyd
Contributor

Description

  • Initialize lower flag explicitly

  • Handle whitespace words from GoldParse correctly when creating raw
    text with orth variants

  • Return the text with original casing if anything goes wrong
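
The three changes above can be illustrated with a minimal sketch. This is not the code from the PR: `rebuild_raw` and `apply_variants` are hypothetical names, and the random choice of the `lower` flag is an assumption made for illustration. The sketch shows the flag being set explicitly, whitespace tokens (such as a non-breaking space) being matched as words instead of being skipped as padding, and the original text being returned when alignment fails.

```python
import random


def rebuild_raw(raw, orig_words, new_words):
    """Rebuild raw text with each original word replaced by its variant.

    Returns None if the words cannot be aligned with the raw text.
    (Hypothetical helper, not spaCy's implementation.)
    """
    out = []
    pos = 0
    for orig, new in zip(orig_words, new_words):
        # Copy inter-token whitespace, but stop as soon as the upcoming
        # token matches here, even if that token is itself whitespace
        # (for example a non-breaking space).
        while pos < len(raw) and raw[pos].isspace() and not raw.startswith(orig, pos):
            out.append(raw[pos])
            pos += 1
        if not raw.startswith(orig, pos):
            return None  # alignment failed
        out.append(new)
        pos += len(orig)
    out.append(raw[pos:])  # keep any trailing text unchanged
    return "".join(out)


def apply_variants(raw, words, variants, lower=None, rng=random):
    """Apply orth variants and optional lowercasing to words and raw text."""
    # Initialize the flag explicitly so it is never left undefined.
    if lower is None:
        lower = rng.random() < 0.5
    new_words = [variants.get(w, w) for w in words]
    if lower:
        new_words = [w.lower() for w in new_words]
    new_raw = rebuild_raw(raw, words, new_words)
    if new_raw is None:
        # Something went wrong: return the input with its original casing.
        return raw, list(words)
    return new_raw, new_words
```

A call such as `apply_variants("Hello \u00a0 world ''", ["Hello", "\u00a0", "world", "''"], {"''": '"'}, lower=True)` keeps the non-breaking space token in place while lowercasing the words and swapping the quote variant.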

Types of change

Bug fix.

Checklist

  • I have submitted the spaCy Contributor Agreement.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@adrianeboyd added the labels bug (Bugs and behaviour differing from documentation), training (Training and updating models), and feat / cli (Feature: Command-line interface) on Apr 27, 2020
@adrianeboyd
Contributor Author

Brought to you by Greek NER data with numerous non-breaking whitespace tokens.

@honnibal honnibal merged commit 74da669 into explosion:master Apr 29, 2020
adrianeboyd added a commit that referenced this pull request Jun 3, 2020
Port relevant changes from #5361:

* Initialize lower flag explicitly

* Handle whitespace words from GoldParse correctly when creating raw
text with orth variants