NER training fails: Process finished with exit code -1073741819 (0xC0000005) #2662
I've managed to reproduce the error in debug mode. The batch size does seem to be the root cause: if a batch contains more than 8 items, training eventually fails with that error code, but if I switch to small batches, training completes.
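The workaround described above is to force a small, fixed batch size. A minimal pure-Python sketch of that batching step (the `train_data` list of `(text, annotations)` pairs is a hypothetical stand-in for the real examples, and this helper only illustrates the idea, not spaCy's own `minibatch` utility):

```python
def minibatch(items, size):
    """Yield successive fixed-size batches from a list of training examples."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Hypothetical stand-in for real (text, annotations) training pairs.
train_data = [("text %d" % i, {"entities": []}) for i in range(20)]

# Cap every batch at 8 items, the threshold reported above.
batches = list(minibatch(train_data, size=8))
```

With 20 examples this produces batches of 8, 8, and 4 items, so no batch ever exceeds the size that avoided the crash.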
I've been stuck on this exact problem for four days now. I get the same error, especially when I use a training dataset of about 200 or more examples, and I have no idea what the cause is. I'm trying to add a new entity type to a Spanish model, but I'm stuck. Can someone please help? Thank you. Environment:
@Nandee89 The most likely cause is a batch that has an unusually large number of total words. How many words are in each of your documents, and how big is your batch size?
@honnibal Thank you for responding. I'm training it on a batch of 799 examples with a total of 26,506 words. The documents/texts are of different sizes.
I also suspected that the dataset is too big, but I thought more data would give a better model. So I trained a model on a smaller dataset and then tried to retrain it with another batch of 150 examples, hoping the weights would get updated. That approach lowered the quality of the model.
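Since the diagnosis above points at the total number of words per batch rather than the number of examples, one way to keep batches safe with documents of different sizes is to cap the word count per batch. A minimal sketch of that idea (the `batch_by_words` helper is hypothetical, token counts are approximated by whitespace splitting, and this is not spaCy's internal batching code):

```python
def batch_by_words(examples, max_words):
    """Group (text, annotations) pairs so no batch exceeds max_words
    total tokens; a single oversized example gets its own batch.
    Token counts use naive whitespace splitting for illustration."""
    batch, count = [], 0
    for text, annots in examples:
        n = len(text.split())
        if batch and count + n > max_words:
            yield batch
            batch, count = [], 0
        batch.append((text, annots))
        count += n
    if batch:
        yield batch

# Hypothetical documents of varying lengths, like those described above.
docs = [("word " * n).strip() for n in (30, 5, 40, 10, 10)]
examples = [(d, {"entities": []}) for d in docs]
batches = list(batch_by_words(examples, max_words=50))
```

Here the five documents are grouped into three batches whose word totals each stay at or below 50, regardless of how many examples land in a batch.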
I'm trying to train it using the script at https://github.com/explosion/spaCy/blob/master/examples/training/train_new_entity_type.py, but with a Spanish model.
It seems that the problem is with using a pre-existing model. I trained a blank Language class with the 799 training examples and it didn't crash at all. I trained two times and it worked well in both, skipping the `if model is not None:` branch of the script.
@honnibal I used a batch size of 64, where each instance consists of approximately 15 tokens.
This problem should now be resolved: v2.0.17 has a fix for a memory error I think was at fault. You can already try it by upgrading.
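Since the fix mentioned above landed in v2.0.17, checking whether an installed version is new enough means comparing version numbers numerically, not as strings. A minimal sketch (the `version_tuple` helper is hypothetical; real code might use a dedicated version-parsing library instead):

```python
def version_tuple(version):
    """Parse a dotted version string such as '2.0.17' into a tuple of
    ints so versions compare numerically rather than lexically."""
    return tuple(int(part) for part in version.split("."))

# The memory-error fix described above shipped in v2.0.17.
FIXED_IN = version_tuple("2.0.17")

# A plain string comparison would wrongly rank "2.0.5" above "2.0.17";
# the tuple comparison gets it right.
assert version_tuple("2.0.5") < FIXED_IN
assert version_tuple("2.0.18") >= FIXED_IN
```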
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
I tried to train a new NER model based on the sample code on your site with my data (fewer than 4K records).
After only a few training iterations, the training process suddenly fails with the error message in the title.
I suspected that it had something to do with the batch size, but even if I try to train it example by example, the training process fails.
I could not reproduce the error while debugging, so maybe it's a memory leak issue (allocation/releasing)?
Please advise.
Your Environment