textcat training is not deterministic with gpu enabled #6373

Closed · wlwg opened this issue Nov 10, 2020 · 8 comments · Fixed by #6411
Labels
bug (Bugs and behaviour differing from documentation) · feat / textcat (Feature: Text Classifier) · gpu (Using spaCy on GPU) · reproducibility (Consistency, reproducibility, determinism, and randomness) · training (Training and updating models)

Comments

@wlwg commented Nov 10, 2020

How to reproduce the behaviour

This is related to #6177. I can verify that when using the CPU, the training losses/weights for textcat are deterministic with fix_random_seed. However, if I enable the GPU via spacy.require_gpu(), the training losses/weights are different every time.

import spacy
spacy.require_gpu()

for _ in range(2):
    spacy.util.fix_random_seed(0)

    model = spacy.load('en_core_web_sm')

    model.add_pipe(model.create_pipe('textcat'))
    model.remove_pipe('parser')
    model.remove_pipe('tagger')

    cat = model.get_pipe('textcat')
    cat.add_label("dog")
    cat.add_label("donut")

    model.begin_training()
    print(model("What even is?").cats)

Output:

{'dog': 0.2501096725463867, 'donut': 0.3427947163581848}
{'dog': 0.9567031860351562, 'donut': 0.9506585001945496}
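
For context, fix_random_seed is generally expected to seed the Python, NumPy and (when running on GPU) CuPy generators. Below is a minimal sketch of that kind of helper, assuming cupy may or may not be installed; it is not spaCy's exact implementation.

import random

import numpy


def fix_random_seed(seed=0):
    # Seed the CPU-side generators.
    random.seed(seed)
    numpy.random.seed(seed)
    # Seed the GPU generator too, if CuPy is available.
    try:
        import cupy
        cupy.random.seed(seed)
    except ImportError:
        pass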

Your Environment

  • Operating System: Linux
  • Python Version Used: 3.6.9
  • spaCy Version Used: latest on master (git sha: 320a8b1)
  • Environment Information: Google Colab
@adrianeboyd added the feat / textcat (Feature: Text Classifier), gpu (Using spaCy on GPU), and training (Training and updating models) labels Nov 11, 2020
@adrianeboyd (Contributor)

Hmm, I can't reproduce this.

Can you double-check by explicitly uninstalling spacy in colab before installing from master? It's possible that the default spacy install isn't being replaced/uninstalled cleanly when you install from source.

What do you see in spacy.git_info.GIT_VERSION?

@adrianeboyd added the more-info-needed (This issue needs more information) label Nov 11, 2020
@svlandeg (Member)

And what is your thinc version?

@wlwg (Author) commented Nov 13, 2020

@adrianeboyd @svlandeg
spacy.__version__: 2.3.2
spacy.git_info.GIT_VERSION: 320a8b148
thinc: 7.4.1

I just wrote up a more detailed script: https://colab.research.google.com/drive/1lVJpVE-SS85jQP3LdkuZkhKvpBA0EuXM?usp=sharing

The no-response bot removed the more-info-needed (This issue needs more information) label Nov 13, 2020
@adrianeboyd (Contributor)

Hmm, I do think there may be a bug of some sort here in spaCy v2. Locally and with the colab example above I get consistent results across multiple CPU runs and across multiple GPU runs (also with our quick internal test cases related to this), but the CPU and GPU results are not similar to each other, and if I extend the training a bit I do get different results across multiple GPU runs. We will look into it!

In better news, with spacy v3 I get the same results on both (minus some float rounding differences, of course).
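
For reference, a roughly equivalent repro under the spaCy v3 API (a sketch, not something posted in this thread) would look like this:

import spacy
from spacy.util import fix_random_seed

spacy.require_gpu()

for _ in range(2):
    fix_random_seed(0)

    # In v3, pipes are added by their registered string name.
    nlp = spacy.blank("en")
    textcat = nlp.add_pipe("textcat")
    textcat.add_label("dog")
    textcat.add_label("donut")

    # initialize() replaces v2's begin_training().
    nlp.initialize()
    print(nlp("What even is?").cats)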

@adrianeboyd added the bug (Bugs and behaviour differing from documentation) label Nov 16, 2020
@svlandeg (Member)

I'd be happy to look into this further, but I can't reproduce... :(

If I run this on either CPU or GPU, I just keep getting consistent results, after installing a clean copy of spacy[cuda101]. I can run the training loop 200 times and keep getting the same result.

The only thing I can think of right now is that this happens on Linux and not Windows? Though that makes little sense to me. @adrianeboyd: you couldn't replicate at first either - what exactly did you change to replicate this?

@adrianeboyd (Contributor)

Here's my test script (just adapted a bit from the one in the colab example):

import spacy
from spacy.util import minibatch, compounding

def train():
    spacy.util.fix_random_seed(0)
    model = spacy.blank("en")

    model.add_pipe(model.create_pipe("textcat"))

    cat = model.get_pipe("textcat")
    cat.add_label("dog")
    cat.add_label("donut")

    # Synthetic data: label scores interpolate linearly between the two categories.
    x_train = [f"example {i}" for i in range(1000)]
    y_train = [{"cats": {"dog": i/1000, "donut": 1 - i/1000}} for i in range(1000)]
    train_data = list(zip(x_train, y_train))

    optimizer = model.begin_training()
    for i in range(10):
        batches = minibatch(train_data, size=compounding(16, 64, 1.001))
        losses = {}
        for batch in batches:
            x_batch, y_batch = zip(*batch)
            # drop=0 removes dropout as a source of randomness.
            model.update(x_batch, y_batch, sgd=optimizer, drop=0, losses=losses)
        print(i, "loss:", losses["textcat"])
    print("example 10:", model("example 10").cats)
    print()

if __name__ == "__main__":
    print("1st time CPU:")
    train()
    print("2nd time CPU:")
    train()
    print("\nEnabling GPU\n")
    spacy.require_gpu()
    print("1st time GPU:")
    train()
    print("2nd time GPU:")
    train()

Output:

1st time CPU:
0 loss: 0.020526510332956605
1 loss: 0.2192715626588324
2 loss: 0.1541586974939264
3 loss: 0.21435572720838536
4 loss: 0.1982542650088135
5 loss: 0.19825033005452042
6 loss: 0.19787737677813766
7 loss: 0.016827800470196053
8 loss: 0.02887996903154999
9 loss: 0.02469563187116819
example 10: {'dog': 0.001906172838062048, 'donut': 0.6181842684745789}

2nd time CPU:
0 loss: 0.020526510332956605
1 loss: 0.2192715626588324
2 loss: 0.1541586974939264
3 loss: 0.21435572720838536
4 loss: 0.1982542650088135
5 loss: 0.19825033005452042
6 loss: 0.19787737677813766
7 loss: 0.016827800470196053
8 loss: 0.02887996903154999
9 loss: 0.02469563187116819
example 10: {'dog': 0.001906172838062048, 'donut': 0.6181842684745789}


Enabling GPU

1st time GPU:
0 loss: 0.022869700213050237
1 loss: 0.06781688092814875
2 loss: 0.15603950362856267
3 loss: 0.029185388615587726
4 loss: 0.04577569641696755
5 loss: 0.03271988184133079
6 loss: 0.030841199260066787
7 loss: 0.016764739026257303
8 loss: 0.023379557263069728
9 loss: 0.020565684088069247
example 10: {'dog': 0.15584374964237213, 'donut': 0.9999545812606812}

2nd time GPU:
0 loss: 0.022846033180030645
1 loss: 0.07457155887192357
2 loss: 0.1533858735638205
3 loss: 0.03846120528942265
4 loss: 0.030317590604681754
5 loss: 0.022946339027839713
6 loss: 0.040068494405659294
7 loss: 0.023592466532136314
8 loss: 0.02665060829349386
9 loss: 0.021907005400862545
example 10: {'dog': 0.15843163430690765, 'donut': 0.9288136959075928}

I tested in a new venv with everything from wheels except spacy (from master as of now). "example 10" is the model's cats output for the text "example 10".

example 10 for a few more GPU runs:

{'dog': 0.2435295134782791, 'donut': 0.9999375343322754}
{'dog': 0.4791581332683563, 'donut': 0.9981231093406677}
{'dog': 0.6463608145713806, 'donut': 0.016409972682595253}
{'dog': 0.14756248891353607, 'donut': 0.9230985045433044}

pip freeze: freeze.txt

I redid the test with v3 and the results are a bit more variable than I thought between CPU and GPU, but they're not that different across GPU runs.

CPU: {'dog': 0.0654868334531784, 'donut': 0.9892733693122864}
GPU 1: {'dog': 0.022449197247624397, 'donut': 0.9723042249679565}
GPU 2: {'dog': 0.02237524650990963, 'donut': 0.9726961255073547}
GPU 3: {'dog': 0.022426428273320198, 'donut': 0.9722701907157898}
GPU 4: {'dog': 0.02197781391441822, 'donut': 0.9722147583961487}

@svlandeg (Member) commented Nov 19, 2020

Thanks Adriane - the original script didn't include a model.update call, which is what prevented me from reproducing this.

I was finally able to track this down to the ParametricAttention layer of the CNN model in the default textcat architecture. PR #6411 should fix this - but it requires an update of Thinc to 7.4.3 (to be released).
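
Until that update is out, one possible workaround (a sketch based on spaCy v2's documented textcat config options, not something verified in this thread) is to select the bow architecture, which avoids the CNN and its ParametricAttention layer:

import spacy

spacy.require_gpu()
spacy.util.fix_random_seed(0)

nlp = spacy.blank("en")
# The "bow" architecture is a bag-of-words model that skips the CNN
# (and therefore the ParametricAttention layer) entirely.
textcat = nlp.create_pipe(
    "textcat",
    config={"architecture": "bow", "exclusive_classes": False},
)
nlp.add_pipe(textcat)
textcat.add_label("dog")
textcat.add_label("donut")
nlp.begin_training()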

@github-actions (bot)

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

The github-actions bot locked this conversation as resolved and limited it to collaborators Oct 29, 2021
@polm added the reproducibility (Consistency, reproducibility, determinism, and randomness) label Nov 22, 2022