Reproducibility for TextCat and Tok2Vec #6218

Merged (2 commits) on Oct 7, 2020

Conversation

svlandeg
Member

@svlandeg svlandeg commented Oct 7, 2020

Fixes #6177

Description

Training results could differ between runs, even with a fixed random seed, because the HashEmbed layers could be assigned a different seed/key on each run. Fixing these seeds to constant values resolves the issue.

This is a backport of #5735, intended as a quick bugfix for the next 2.x release.
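
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It does not use the actual spaCy or Thinc code: `ToyHashEmbed`, `bucket` and the `_layer_ids` counter are hypothetical names, and the assumption that an unseeded layer falls back to a process-wide counter is only one plausible way a seed can end up depending on construction order.

```python
import hashlib
import itertools

# Hypothetical process-wide counter, standing in for any implicit state
# that changes depending on how many layers were created earlier.
_layer_ids = itertools.count()


def bucket(key, seed, n_rows):
    """Map an integer key to a table row, using a seeded stable hash."""
    data = f"{seed}:{key}".encode("utf8")
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little") % n_rows


class ToyHashEmbed:
    """Toy stand-in for a hash-embedding layer (not the real HashEmbed)."""

    def __init__(self, n_rows, seed=None):
        layer_id = next(_layer_ids)
        self.n_rows = n_rows
        # Without an explicit seed, the seed depends on construction order.
        self.seed = layer_id if seed is None else seed

    def rows(self, keys):
        return [bucket(key, self.seed, self.n_rows) for key in keys]


keys = [101, 202, 303, 404]

# Implicit seeds: an unrelated layer created in between shifts the counter,
# so the two layers hash the same keys with different seeds.
implicit_a = ToyHashEmbed(1000)
_unrelated = ToyHashEmbed(1000)
implicit_b = ToyHashEmbed(1000)

# Explicit seeds: the key-to-row mapping no longer depends on what else
# was constructed before the layer.
fixed_a = ToyHashEmbed(1000, seed=7)
fixed_b = ToyHashEmbed(1000, seed=7)

print(implicit_a.seed, implicit_b.seed)
print(fixed_a.rows(keys) == fixed_b.rows(keys))
```

With implicit seeds, the same keys will very likely land in different rows in `implicit_a` and `implicit_b`, so training updates touch different parameters even when the global random seed is fixed. Passing a constant seed removes that dependency.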

Types of change

bug fix

Checklist

  • I have submitted the spaCy Contributor Agreement.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@svlandeg svlandeg added the bug, feat / training and training labels and removed the feat / training label Oct 7, 2020
@honnibal honnibal merged commit 2998131 into explosion:master Oct 7, 2020
@svlandeg svlandeg deleted the bugfix/6177 branch October 8, 2020 06:57
Labels
bug, training
Development

Successfully merging this pull request may close these issues.

textcat model weights are not deterministic even with random.seed
2 participants