💫 Merge conllu converters into one function #3373

Closed
ines opened this issue Mar 8, 2019 · 1 comment
Labels
enhancement (Feature requests and improvements), feat / cli (Feature: Command-line interface), help wanted (easy) (Contributions welcome! Also suited for spaCy beginners), help wanted (Contributions welcome!)

Comments

ines (Member) commented Mar 8, 2019

I was just updating the spacy convert command and docs and noticed that the converters are currently a bit messy. We might be able to eliminate some code duplication and make them a bit nicer overall. It'd be nice if we could merge the conll converters (or at least conll/conllu and conllubio) into a single script.

(That said, we do have to accept that they'll always be kinda hacky, simply because the formats we're dealing with aren't always 100% consistent. There are several variations of the .conll format alone that we need to handle.)
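
To make the idea of a merged converter a bit more concrete, here's a rough sketch of how one reader could cover both the plain conllu layout and the BIO variant. This is not the existing converter code: the function name `read_conllu`, the `ner_in_last_column` flag and the token dict keys are all made up for illustration, and it assumes the BIO variant stores the entity tag in the tenth tab-separated column.

```python
from typing import Dict, Iterator, List, Optional


def read_conllu(
    text: str, ner_in_last_column: bool = False
) -> Iterator[List[Dict[str, Optional[object]]]]:
    """Yield one list of token dicts per sentence.

    Hypothetical sketch: `ner_in_last_column` is an assumed switch for the
    BIO variant, where column 10 is taken to hold the entity tag.
    """
    tokens: List[Dict[str, Optional[object]]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if tokens:
                yield tokens
                tokens = []
            continue
        if line.startswith("#"):
            # Sentence-level comments (e.g. "# text = ...") are skipped.
            continue
        parts = line.split("\t")
        if len(parts) != 10:
            raise ValueError("Expected 10 tab-separated columns: %r" % line)
        id_, word, lemma, pos, tag, morph, head, dep, _deps, last = parts
        if "-" in id_ or "." in id_:
            # Skip multiword-token ranges (1-2) and empty nodes (1.1).
            continue
        tokens.append(
            {
                "id": int(id_),
                "orth": word,
                "lemma": lemma,
                "pos": pos,
                "tag": tag,
                "morph": morph,
                "head": int(head) if head.isdigit() else None,
                "dep": dep,
                # Only interpret the last column as NER when asked to.
                "ner": last if ner_in_last_column else None,
            }
        )
    if tokens:
        # Input did not end with a blank line.
        yield tokens
```

The only difference between the two code paths is how the last column is interpreted, which is the kind of duplication a single function could absorb. Other variations of the format would probably need their own switches or a small pre-processing step.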

Resources

FYI: Future plans

  1. Change converters to create and return Doc objects instead of JSON objects. This means that the new Doc.to_json method will be the single source of truth for the JSON format, and we don't end up with arbitrary data transformation logic all over the place.
  2. Update the training data format to a more straightforward JSONL format. See 💫 Proposal: New JSON(L) format for training and improved training commands #2928.

ines added the enhancement, help wanted, help wanted (easy) and feat / cli labels Mar 8, 2019
ines closed this as completed Mar 15, 2019

lock bot commented Apr 14, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked as resolved and limited conversation to collaborators Apr 14, 2019