textcat model weights are not deterministic even with random.seed #6177

Closed
themrmax opened this issue Oct 1, 2020 · 3 comments · Fixed by #6218
Labels

  • bug: Bugs and behaviour differing from documentation
  • duplicate: Issues that have been reported before
  • feat / textcat: Feature: Text Classifier
  • reproducibility: Consistency, reproducibility, determinism, and randomness

Comments

themrmax commented Oct 1, 2020

I'm trying to write some unit tests for my code and am struggling to make them deterministic. Consider the following example:

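import spacy

# Run the same setup twice; with a fixed seed, both runs should print the same scores.
for _ in range(2):
    # Reset the random seed before each run.
    spacy.util.fix_random_seed(0)

    model = spacy.load('en_core_web_sm')

    # Add an untrained text classifier and drop the components that are not needed.
    model.add_pipe(model.create_pipe('textcat'))
    model.remove_pipe('parser')
    model.remove_pipe('tagger')

    cat = model.get_pipe('textcat')
    cat.add_label("dog")
    cat.add_label("donut")

    # Initialize the model weights, then score a sample text.
    model.begin_training()
    print(model("What even is?").cats)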

# output:
# {'dog': 0.527805507183075, 'donut': 0.8244330286979675}
# {'dog': 0.6659842729568481, 'donut': 0.19725579023361206}

Looks like maybe a duplicate/regression of #3182

Your Environment

  • Operating System: Mac OS 10.15.7
  • Python Version Used: 3.7.4
  • spaCy Version Used: 2.3.0
  • Environment Information: anaconda3
adrianeboyd added the bug, duplicate, and feat / textcat labels Oct 2, 2020
adrianeboyd (Contributor) commented

Yes, this is a known (and frustrating!) issue; see #5551. I think this will be fixed in spaCy v3. The relevant PR is #5735.

At first glance, I think this wouldn't be too hard to backport to spaCy v2, but this hasn't been done yet.

svlandeg (Member) commented Oct 7, 2020

PR #6218 should fix it for v2. When I run your code, I get:

{'dog': 0.5329195857048035, 'donut': 0.5502294301986694}
{'dog': 0.5329195857048035, 'donut': 0.5502294301986694}
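
For the unit-test use case from the original report, a check along these lines could assert that repeated runs agree. This is only a minimal sketch: `scores_match`, `is_reproducible`, and `toy_scores` are hypothetical names introduced here, and `toy_scores` is a standard-library stand-in for the real pipeline so the example runs without spaCy. In a real test, the scoring function would build the pipeline and return `doc.cats`, and the seeding function would call `spacy.util.fix_random_seed(0)` as in the report.

```python
import math
import random


def scores_match(a, b, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if two label -> score dicts have the same labels and near-equal scores."""
    if a.keys() != b.keys():
        return False
    return all(math.isclose(a[k], b[k], rel_tol=rel_tol, abs_tol=abs_tol) for k in a)


def is_reproducible(score_fn, seed_fn, runs=2):
    """Call seed_fn() and then score_fn() `runs` times; True if every result matches the first."""
    results = []
    for _ in range(runs):
        seed_fn()                  # reset the random state before each run
        results.append(score_fn())
    return all(scores_match(results[0], r) for r in results[1:])


def toy_scores():
    # Hypothetical stand-in for: load the model, add textcat, call
    # begin_training(), and return model("What even is?").cats
    return {"dog": random.random(), "donut": random.random()}


# Re-seeding before each run makes the toy scores identical across runs.
print(is_reproducible(toy_scores, lambda: random.seed(0)))  # True
```

Comparing with a tolerance rather than `==` is a design choice: it keeps the test from failing on tiny floating-point differences while still catching the large run-to-run gaps shown in the original report.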

github-actions bot (Contributor) commented

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions bot locked as resolved and limited conversation to collaborators Oct 31, 2021
polm added the reproducibility label Nov 22, 2022