
Can you please give an example using this model for a complete pipeline? #2951

Closed
deepdad opened this issue Nov 20, 2018 · 2 comments
Labels
usage General spaCy usage

Comments

@deepdad

deepdad commented Nov 20, 2018

Can you please give an example using this model for a complete pipeline?

How to reproduce the behaviour

nlp = spacy.load('en_vectors_web_lg')
print(nlp.pipeline)

[]
and only the tokens are available:

doc = nlp(u'Only the tokens are available.')
token_i = 0
ann_toks_deps_tags = []
for tok in doc:
    token_i += 1
    ann_toks_deps_tags.append((tok.text, tok.tag_, tok.dep_))

[('Only', '', ''), ('the', '', ''), ('tokens', '', ''), ('are', '', ''), ('available', '', ''), ('.', '', '')]

using any of

nlp.add_pipe(nlp.create_pipe('tokenizer'))
nlp.add_pipe(nlp.create_pipe('tagger'))
nlp.add_pipe(nlp.create_pipe('parser'))

results in errors like:

  File "/usr/local/lib/python3.6/dist-packages/spacy/language.py", line 346, in __call__
    doc = proc(doc)
TypeError: Argument 'string' has incorrect type (expected str, got spacy.tokens.doc.Doc)
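(Context, not part of the original thread: this TypeError happens because the tokenizer is not a pipeline component. It is the step that runs *before* the pipeline, turning a raw string into a Doc, so registering it with add_pipe hands it a Doc instead of a str. A minimal sketch, assuming spaCy is installed; nlp.tokenizer works the same way in v2 and v3:)

```python
import spacy

# A blank English pipeline still has a tokenizer attached.
nlp = spacy.blank("en")

# The tokenizer maps str -> Doc; it runs before any pipeline component.
doc = nlp.tokenizer("Only the tokens are available.")
print([t.text for t in doc])

# Pipeline components, by contrast, receive a Doc and return a Doc,
# which is why a tokenizer added via add_pipe raises the TypeError above.
```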

Or,

cat test3.py 

import spacy
spacy.prefer_gpu()
nlp = spacy.load('en_vectors_web_lg')
print(nlp.pipeline) # []
nlp.add_pipe(nlp.create_pipe('tensorizer'))
nlp.add_pipe(nlp.create_pipe('tokenizer'))
nlp.add_pipe(nlp.create_pipe('tagger'))
nlp.add_pipe(nlp.create_pipe('parser'))
nlp.add_pipe(nlp.create_pipe('sentencizer'))
nlp.add_pipe(nlp.create_pipe('merge_noun_chunks'))
nlp.add_pipe(nlp.create_pipe('merge_entities'))

print(nlp.pipeline) #list of the above

doc = nlp(u'Do something every day.')
token_i = 0
for tok in doc:
    token_i += 1
    print((tok.text, tok.tag_, tok.dep_))

results in

python3 test3.py 
[]
[('tensorizer', <spacy.pipeline.Tensorizer object at 0x7f2b9575beb8>), ('Tokenizer', <spacy.tokenizer.Tokenizer object at 0x7f2b957699a8>), ('tagger', <spacy.pipeline.Tagger object at 0x7f2af258ce10>), ('parser', <spacy.pipeline.DependencyParser object at 0x7f2af257de08>), ('sbd', <spacy.pipeline.SentenceSegmenter object at 0x7f2af258cc50>), ('merge_noun_chunks', <built-in function merge_noun_chunks>), ('merge_entities', <built-in function merge_entities>)]
Traceback (most recent call last):
  File "test3.py", line 15, in <module>
    doc = nlp(u'Do something every day.')
  File "/usr/local/lib/python3.6/dist-packages/spacy/language.py", line 346, in __call__
    doc = proc(doc)
  File "pipeline.pyx", line 305, in spacy.pipeline.Tensorizer.__call__
  File "pipeline.pyx", line 329, in spacy.pipeline.Tensorizer.predict
AttributeError: 'bool' object has no attribute 'ops'
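(Context, not part of the original thread: components created with create_pipe are registered in the pipeline but start with no trained model weights, which is why calling the Tensorizer fails here. The same is true of en_vectors_web_lg, which ships word vectors only and therefore prints an empty pipeline. A minimal sketch using the modern spaCy v3 API — in v2 this was nlp.add_pipe(nlp.create_pipe('tagger')) — assuming spaCy is installed:)

```python
import spacy

# A blank pipeline starts with no components at all,
# just like the vectors-only en_vectors_web_lg package.
nlp = spacy.blank("en")
print(nlp.pipe_names)  # []

# Adding a trainable component registers it in the pipeline,
# but its model weights stay uninitialized until training is run,
# so calling nlp() on text would still fail.
nlp.add_pipe("tagger")
print(nlp.pipe_names)  # ['tagger']
```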

Can you please give an example using this model for a complete pipeline, including sentencizer, parser, tagger, tokenizer, etc.?

Your Environment

  • Operating System: Ubuntu 18 (I did have some issues during installation)

  • Python Version Used: 3.6

  • spaCy Version Used: v2.0.16

  • Environment Information:
    CUDA 10, Tesla V100
    Installed models (spaCy v2.0.16)
    /usr/local/lib/python3.6/dist-packages/spacy

    TYPE NAME MODEL VERSION
    package en-vectors-web-lg en_vectors_web_lg 2.0.0 ✔
    package en-core-web-lg en_core_web_lg 2.0.0 ✔
    link en_vectors_web_lg en_vectors_web_lg 2.0.0 ✔
    link en_core_web_lg en_core_web_lg 2.0.0 ✔

@honnibal
Member

You need to use a pipeline with pre-trained models in order to use those components, or you need to train one yourself. There are many examples at https://spacy.io/usage/
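(Context, not part of the original thread: the two options the answer describes can be sketched as follows. Option 1 requires a downloaded pretrained package such as the en_core_web_lg listed in the environment above; the sketch below instead uses the rule-based sentencizer, which needs no training. The add_pipe string API shown is spaCy v3; in v2 this was nlp.add_pipe(nlp.create_pipe('sentencizer')).)

```python
import spacy

# Option 1: load a pretrained pipeline, which provides tagger/parser/ner
# out of the box (requires the model package to be downloaded):
# nlp = spacy.load("en_core_web_lg")

# Option 2: rule-based components such as the sentencizer work
# without any training data.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

doc = nlp("Do something every day. Then rest.")
print([sent.text for sent in doc.sents])
```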

@honnibal honnibal added the usage General spaCy usage label Nov 26, 2018
@lock

lock bot commented Dec 26, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Dec 26, 2018