.update call generates the error:: TypeError: object of type 'NoneType' has no len() #5181

Closed
AbhayGodbole opened this issue Mar 23, 2020 · 12 comments
Labels
bug (Bugs and behaviour differing from documentation), feat / ner (Feature: Named Entity Recognizer), more-info-needed (This issue needs more information), training (Training and updating models)

Comments

@AbhayGodbole

Hi,
I am creating a custom model and am facing a couple of issues:

  1. The error below started when I upgraded to version 2.2.4. When I downgrade to 2.1 with the same training data, the model trains without any error.

nlp.update(texts, annotations, sgd=optimizer, drop=0.35, losses=losses)
  File "c:\abhay\ai\ris-auditsupoort\vir-ris-auditsupport\lib\site-packages\spacy\language.py", line 519, in update
    proc.update(docs, golds, sgd=get_grads, losses=losses, **kwargs)
  File "nn_parser.pyx", line 445, in spacy.syntax.nn_parser.Parser.update
  File "nn_parser.pyx", line 547, in spacy.syntax.nn_parser.Parser._init_gold_batch
  File "ner.pyx", line 104, in spacy.syntax.ner.BiluoPushDown.preprocess_gold
  File "ner.pyx", line 97, in spacy.syntax.ner.BiluoPushDown.has_gold
TypeError: object of type 'NoneType' has no len()

  2. When I create a model using version 2.1, the results are not the same as those of the model generated with version 2.0 (I am not sure of the exact version, as that setup is on my office desktop and I am now working from home on a newly set up VDI).
  3. When I try to install version 2.0, it does not install in the environment below.

(I had to set up the current environment on the VDI provided by the client for working from home. My previous environment was on my office desktop, where everything worked fine: both models were created without error and the model results were also good.)

Environment

Info about spaCy

  • spaCy version: 2.2.4
  • Platform: Windows-10-10.0.17763-SP0
  • Python version: 3.7.1
@svlandeg
Member

Thanks for the report!

Could you provide a little more context to help us debug this? If at all possible, a minimal working code snippet (+ 1 or 2 lines of sample data) that shows the error would be very useful for us to try and replicate this. We haven't seen this problem while training an NER on 2.2.4 before, but the fact that the same code does run on 2.1 certainly seems to indicate a problem with the new version.

@svlandeg svlandeg added feat / ner Feature: Named Entity Recognizer training Training and updating models more-info-needed This issue needs more information labels Mar 24, 2020
@no-response

no-response bot commented Apr 7, 2020

This issue has been automatically closed because there has been no response to a request for more information from the original author. With only the information that is currently in the issue, there's not enough information to take action. If you're the original author, feel free to reopen the issue if you have or find the answers needed to investigate further.

@no-response no-response bot closed this as completed Apr 7, 2020
@Z-e-e

Z-e-e commented Apr 29, 2020

@svlandeg was there an update on this? I am running into the same issue.

@svlandeg
Member

This issue was closed because it's virtually impossible for us to debug without a runnable code snippet + sample data that exhibits the error, but my request for this was never answered.

@j-cahill10

j-cahill10 commented May 9, 2020

I have the exact same problem. I can confirm every once in a while the code runs as expected so that is confusing. Here is the block of code the error is coming from. My error message is identical to the original one.

with nlp.disable_pipes(*other_pipes):  # only train textcat
    optimizer = nlp.begin_training()
    print("Training the model...")
    print('{:^5}\t{:^5}\t{:^5}\t{:^5}\t{:^5}'.format('LOSS', 'P', 'R', 'F', 'ACC'))
    for i in range(n_iter):
        losses = {}
        batches = get_batches(train_data, 'textcat')
        for batch in batches:
            texts, annotations = zip(*batch)
            nlp.update(texts, annotations, sgd=optimizer, drop=next(dropout), losses=losses)
        # evaluate on the dev set using the averaged parameters
        with textcat.model.use_params(optimizer.averages):
            scores = evaluate(nlp.tokenizer, textcat, dev_texts, dev_cats)
        # print a simple table: loss, precision, recall, F-score, accuracy
        print('{0:.3f}\t{1:.3f}\t{2:.3f}\t{3:.3f}\t{4:.3f}'
              .format(losses['textcat'], scores['textcat_p'],
                      scores['textcat_r'], scores['textcat_f'], scores['accuracy']))

My data set is a text file with each line formatted as follows:

label1 [text...]
label0 [text...]

I am separating this into corresponding lists of categories and texts. The texts are just the plain text and the categories are structured as follows:

category structure: {"cats": {"SUBJECTIVE": 1}}
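
For readers following along, the split described above might look roughly like the sketch below. It is not the original script: the function name `parse_labelled_lines` is hypothetical, and it assumes each line is a label token, one space, then the text, with `label1` mapping to SUBJECTIVE = 1 and `label0` to 0.

```python
def parse_labelled_lines(lines):
    """Split 'label text' lines into parallel lists of texts and annotations."""
    texts, annotations = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip completely blank lines
        # Everything before the first space is the label, the rest is the text.
        label, _, text = line.partition(" ")
        # Assumption: "label1" marks a SUBJECTIVE example, anything else does not.
        cats = {"SUBJECTIVE": 1 if label == "label1" else 0}
        texts.append(text)
        annotations.append({"cats": cats})
    return texts, annotations


sample = ["label1 I loved every minute of it", "label0 The film runs 90 minutes"]
texts, annotations = parse_labelled_lines(sample)
```

Note that under these assumptions a line holding only a label (for example `label1 ` with nothing after the space) produces an empty string as its text.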

@j-cahill10

I can confirm there is a stray space as a text input somewhere in my set. This was never a problem with the older version but is in the newer version. That is an error on my end, but it is still odd that the error only shows up on 2.2.

@svlandeg
Member

svlandeg commented May 9, 2020

Thanks for the additional information @j-cahill10! So this gets triggered by an "empty" text (or just a whitespace) in your training data? If you remove it in preprocessing, you don't get the error anymore?

@j-cahill10

It is triggered by an empty string. In my case it was just: ""

I had originally tried filtering for NoneType elements and did not pick anything up, so if you are having this issue, double-check the text being fed into your program.
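
A preprocessing step along these lines could catch the problem before `nlp.update` is called. This is a minimal sketch and not the fix that went into spaCy: the helper name `drop_empty_texts` is hypothetical, and it assumes the training data is a list of `(text, annotations)` pairs. A check for `None` alone does not catch `""` or `" "`, which is why the text is stripped before testing.

```python
def drop_empty_texts(train_data):
    """Return (kept, dropped_indices) for a list of (text, annotations) pairs."""
    kept, dropped = [], []
    for i, (text, annotations) in enumerate(train_data):
        # None, "" and whitespace-only strings are all treated as empty.
        if text is None or not text.strip():
            dropped.append(i)
        else:
            kept.append((text, annotations))
    return kept, dropped


train_data = [
    ("This is great", {"cats": {"SUBJECTIVE": 1}}),
    ("", {"cats": {"SUBJECTIVE": 0}}),
    (" ", {"cats": {"SUBJECTIVE": 0}}),
]
kept, dropped = drop_empty_texts(train_data)
print(dropped)  # [1, 2]
```

Returning the dropped indices makes it easier to trace the offending rows back to the source file.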

@svlandeg svlandeg added the bug Bugs and behaviour differing from documentation label May 11, 2020
@svlandeg
Member

I could identify and resolve the original bug, which indeed got triggered when training an NER component with an empty text.

However @j-cahill10, I think your issue is probably related, but not exactly the same, as it involves training the textcat, right? Can you print the full error log you're getting? I can't replicate it.

@svlandeg
Member

@AbhayGodbole: your original bug should be addressed by PR #5425

@alexf-a

alexf-a commented Jun 26, 2020

Still getting this on 2.2.4 with Python 3.7.7 and Mac 10.15.4, running the code found here: https://github.com/explosion/spaCy/blob/master/examples/training/train_ner.py

I'm running a slight modification of that script, where data is being pulled in from Prodigy and converted to spaCy's training format. But some runs of the script go smoothly and some break with the exact same arguments.

Traceback (most recent call last):
  File "train_spacy_model.py", line 134, in <module>
    plac.call(main)
  File "/Users/Alex.Athanassakos/Codes/resolver-ai/venv/lib/python3.7/site-packages/plac_core.py", line 367, in call
    cmd, result = parser.consume(arglist)
  File "/Users/Alex.Athanassakos/Codes/resolver-ai/venv/lib/python3.7/site-packages/plac_core.py", line 232, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "train_spacy_model.py", line 97, in main
    losses=losses,
  File "/Users/Alex.Athanassakos/Codes/resolver-ai/venv/lib/python3.7/site-packages/spacy/language.py", line 519, in update
    proc.update(docs, golds, sgd=get_grads, losses=losses, **kwargs)
  File "nn_parser.pyx", line 445, in spacy.syntax.nn_parser.Parser.update
  File "nn_parser.pyx", line 547, in spacy.syntax.nn_parser.Parser._init_gold_batch
  File "ner.pyx", line 104, in spacy.syntax.ner.BiluoPushDown.preprocess_gold
  File "ner.pyx", line 97, in spacy.syntax.ner.BiluoPushDown.has_gold
TypeError: object of type 'NoneType' has no len()
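
Since an empty text was identified earlier in the thread as a trigger for this traceback, it may be worth checking for one during the conversion to spaCy's training format. The conversion script was not posted, so the sketch below is only a guess at its shape: the function name `to_spacy_ner_format` is hypothetical, and it assumes Prodigy-style records with a `text` key and a `spans` list whose items carry `start`, `end` and `label`.

```python
def to_spacy_ner_format(records):
    """Convert annotation records to (text, {"entities": [...]}) tuples.

    Returns the converted examples plus the indices of skipped records.
    """
    converted, skipped = [], []
    for i, record in enumerate(records):
        text = record.get("text")
        if not text or not text.strip():
            skipped.append(i)  # missing, empty or whitespace-only text
            continue
        spans = record.get("spans") or []
        entities = [(s["start"], s["end"], s["label"]) for s in spans]
        converted.append((text, {"entities": entities}))
    return converted, skipped


records = [
    {"text": "Apple opened a store", "spans": [{"start": 0, "end": 5, "label": "ORG"}]},
    {"text": "", "spans": []},
]
train_data, skipped = to_spacy_ner_format(records)
```

If `skipped` is non-empty, the listed records are candidates for the failing examples.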

@github-actions
Contributor

github-actions bot commented Nov 4, 2021

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 4, 2021