textcat training is not deterministic with gpu enabled #6373
Hmm, I can't reproduce this. Can you double-check by explicitly uninstalling […]? What do you see in […]?

And what is your thinc version?
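For anyone trying to reproduce, the relevant versions can be checked quickly. A small snippet of my own (not from the thread; assumes spaCy v2 with a CUDA-enabled cupy installed):

```python
import cupy
import spacy
import thinc

# Versions that matter for this report; prefer_gpu() reports whether a
# CUDA device was actually picked up.
print("spacy:", spacy.__version__)
print("thinc:", thinc.__version__)
print("cupy:", cupy.__version__)
print("GPU available:", spacy.prefer_gpu())
```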
@adrianeboyd @svlandeg I just wrote up a more detailed script: https://colab.research.google.com/drive/1lVJpVE-SS85jQP3LdkuZkhKvpBA0EuXM?usp=sharing
Hmm, I do think there may be a bug of some sort here in spacy v2. Locally and with the colab example above I get consistent results within multiple CPU and GPU runs (also with our quick internal test cases related to this), but the CPU and GPU results are not similar to each other, and if I extend the training a bit I do get different results for multiple GPU runs. We will look into it!

In better news, with spacy v3 I get the same results on both (minus some float rounding differences, of course).
I'd be happy to look into this further, but I can't reproduce... :( If I run this on either CPU or GPU, I just keep getting consistent results after installing a clean copy of […].

The only thing I can think of right now is that this happens on Linux and not Windows? Though that makes little sense to me.

@adrianeboyd: you couldn't replicate at first either - what exactly did you change to replicate this?
Here's my test script (just adapted a bit from the one in the colab example):

```python
import spacy
from spacy.util import minibatch, compounding


def train():
    # Fixing the seed should make repeated runs reproducible.
    spacy.util.fix_random_seed(0)
    model = spacy.blank("en")
    model.add_pipe(model.create_pipe("textcat"))
    cat = model.get_pipe("textcat")
    cat.add_label("dog")
    cat.add_label("donut")
    # Synthetic data: graded category scores from 0.0 to 1.0.
    x_train = [f"example {i}" for i in range(1000)]
    y_train = [{"cats": {"dog": i/1000, "donut": 1 - i/1000}} for i in range(1000)]
    train_data = list(zip(x_train, y_train))
    optimizer = model.begin_training()
    for i in range(10):
        batches = minibatch(train_data, size=compounding(16, 64, 1.001))
        losses = {}
        for batch in batches:
            x_batch, y_batch = zip(*batch)
            model.update(x_batch, y_batch, sgd=optimizer, drop=0, losses=losses)
        print(i, "loss:", losses["textcat"])
    print("example 10:", model("example 10").cats)
    print()


if __name__ == "__main__":
    print("1st time CPU:")
    train()
    print("2nd time CPU:")
    train()
    print("\nEnabling GPU\n")
    spacy.require_gpu()
    print("1st time GPU:")
    train()
    print("2nd time GPU:")
    train()
```

Output:
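One small add-on (my own sketch, not part of the original script) turns the eyeball comparison into an automated check by capturing each run's printed output and diffing it; it assumes `train` and `spacy` from the script above are in scope:

```python
import io
from contextlib import redirect_stdout

def capture_run():
    # Re-run train() with stdout captured so two runs can be diffed exactly.
    buf = io.StringIO()
    with redirect_stdout(buf):
        train()
    return buf.getvalue()

assert capture_run() == capture_run()  # CPU runs match bit-for-bit
spacy.require_gpu()
print("GPU runs identical:", capture_run() == capture_run())  # False on affected setups
```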
I tested in a new venv with everything from wheels except spacy (from […]).

I redid the test with v3 and the results are a bit more variable than I thought between CPU and GPU, but they're not that different across GPU runs.
Thanks Adriane - the original script didn't have a […]. I was able to finally track this down to the […].
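As general background rather than a claim about the actual fix here: a common source of run-to-run drift on GPUs is floating-point atomic adds, whose accumulation order is not fixed, so non-associative float rounding differs between runs even with all seeds pinned. A minimal CuPy illustration:

```python
import cupy as cp
import cupyx

# Many duplicate indices force concurrent atomicAdd calls into the same bins.
idx = cp.random.randint(0, 8, size=1_000_000)
vals = cp.random.rand(1_000_000, dtype=cp.float32)

def reduce_once():
    out = cp.zeros(8, dtype=cp.float32)
    cupyx.scatter_add(out, idx, vals)  # uses atomic adds on the GPU
    return out

a, b = reduce_once(), reduce_once()
# Float addition is not associative, so a different atomic ordering can give
# a slightly different sum; on many GPUs this prints a non-zero difference.
print(cp.abs(a - b).max())
```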
How to reproduce the behaviour

This is related to #6177. I can verify that when using CPU, the training losses/weights for textcat can be deterministic with `fix_random_seed`. However, if I enable GPU via `spacy.require_gpu()`, the training losses/weights become different every time.

Output:
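For context, `spacy.util.fix_random_seed` in spaCy v2 roughly amounts to the sketch below (an approximation of the released source, not a verbatim copy). It seeds the Python, NumPy, and, when available, CuPy RNGs, which pins down initialization and shuffling but not the order of floating-point operations inside GPU kernels:

```python
import random

import numpy

try:
    import cupy
except ImportError:
    cupy = None

def fix_random_seed(seed=0):
    # Seed every RNG spaCy v2 draws from: Python's, NumPy's, and CuPy's.
    random.seed(seed)
    numpy.random.seed(seed)
    if cupy is not None:
        cupy.random.seed(seed)
```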
Your Environment