Simplify adding new component to existing model with CLI #4342

Closed
adrianeboyd opened this issue Sep 30, 2019 · 1 comment · Fixed by #4911
Labels
feat / cli (Feature: Command-line interface), training (Training and updating models)

Comments

@adrianeboyd

Feature description

It looks like these kinds of cases aren't easy to handle with the train CLI:

  1. I want to train a model from scratch given vectors, one corpus for the tagger, and one corpus for the parser. (See the example in "Multiple roots per sentence", #4306.)

  2. I have a model with a tagger and want to add a parser trained on a separate corpus.

For the internal spaCy models, it looks like each component is trained separately and the components are then combined using custom scripts.

Could it make sense to have a CLI component that combines models/components (with compatibility checks, of course)?

Case (2) is really just a minor variant of (1), but the train CLI might be able to handle it relatively easily, for example by combining the components in model-final. In theory the train CLI could also handle case (1), but the command-line options would get too complicated, so a separate CLI command would be easier.
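To make the idea of a "combine" command a bit more concrete, here is a rough file-level sketch for case (2). It is not spaCy's API or the way the internal scripts work. It assumes the on-disk layout of a saved model: a `meta.json` with `lang`, `pipeline` and `vectors` entries, plus one subdirectory per component. The names `check_compatible` and `combine_models` are hypothetical, the compatibility check is deliberately minimal, and the sketch does not attempt to merge the vocab directories.

```python
import json
import shutil
from pathlib import Path


def check_compatible(base_meta, donor_meta):
    """Raise ValueError if the two meta.json dicts clearly disagree.

    Assumption: comparing `lang` and `vectors` is a reasonable minimal
    check; a real command would likely need to check more than this.
    """
    if base_meta.get("lang") != donor_meta.get("lang"):
        raise ValueError("Models have different languages")
    if base_meta.get("vectors") != donor_meta.get("vectors"):
        raise ValueError("Models were saved with different vectors")


def combine_models(base_dir, donor_dir, component, output_dir):
    """Copy `component` from the donor model into a copy of the base model.

    Hypothetical helper: only copies files and updates the pipeline list.
    The vocab of the base model is kept as-is and is NOT merged.
    """
    base_dir, donor_dir, output_dir = Path(base_dir), Path(donor_dir), Path(output_dir)
    base_meta = json.loads((base_dir / "meta.json").read_text(encoding="utf8"))
    donor_meta = json.loads((donor_dir / "meta.json").read_text(encoding="utf8"))

    check_compatible(base_meta, donor_meta)
    if component not in donor_meta.get("pipeline", []):
        raise ValueError(f"Donor model has no component '{component}'")
    if component in base_meta.get("pipeline", []):
        raise ValueError(f"Base model already has a component '{component}'")

    # Start from a full copy of the base model (output_dir must not exist yet),
    # then add the donor's component directory next to the existing ones.
    shutil.copytree(base_dir, output_dir)
    shutil.copytree(donor_dir / component, output_dir / component)

    # Append the new component to the end of the pipeline in meta.json.
    base_meta["pipeline"] = list(base_meta.get("pipeline", [])) + [component]
    (output_dir / "meta.json").write_text(
        json.dumps(base_meta, indent=2), encoding="utf8"
    )
    return base_meta["pipeline"]
```

With a tagger-only model and a parser-only model, a call like `combine_models("tagger_model", "parser_model", "parser", "combined")` would write a new directory whose `meta.json` lists both components. Whether the result loads and predicts correctly depends on the vocab question, which this sketch leaves open.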

(What is the right way to combine the vocab directories, aside from vectors, for multiple models?)

@lock

lock bot commented Feb 15, 2020

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked this issue as resolved and limited conversation to collaborators on Feb 15, 2020