cfg parameter is empty when run spacy.load() #5137

Closed
alessandrobokan opened this issue Mar 11, 2020 · 4 comments
Labels
bug: Bugs and behaviour differing from documentation
feat / pipeline: Feature: Processing pipeline and components

Comments

alessandrobokan commented Mar 11, 2020

How to reproduce the behaviour

My component class:

class MyComponent(object):
    name = "my_component"

    def __init__(self, nlp, **cfg):
        self.nlp = nlp
        self.categories = cfg.get("categories", "all_categories")  # cfg is EMPTY
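For comparison, here is a minimal sketch of how the keyword arguments reach `cfg` when the class is instantiated directly, outside of `spacy.load()`. The `__call__` method and passing `None` for `nlp` are assumptions added only to keep the example self-contained; they are not part of the original report.

```python
class MyComponent(object):
    name = "my_component"

    def __init__(self, nlp, **cfg):
        self.nlp = nlp
        # Falls back to "all_categories" when no "categories" keyword is given.
        self.categories = cfg.get("categories", "all_categories")

    def __call__(self, doc):
        # Assumption: a pipeline component takes a Doc and returns it.
        return doc


# Direct instantiation: the keyword argument ends up in cfg.
with_cfg = MyComponent(None, categories=["category1", "category2"])

# No keyword arguments: cfg is empty, so the default is used.
without_cfg = MyComponent(None)

print(with_cfg.categories)
print(without_cfg.categories)
```

The second case is the one that matches the behaviour seen through `spacy.load()`: the default is used because nothing arrives in `cfg`.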

I used entry points to add the component my_component to the pipeline by registering it under spacy_factories in setup.py. Then I ran python setup.py sdist to build the package en_core_web_test_sm and installed it with pip install.

However, when I run the following code:
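The entry-point registration described above might look roughly like the following fragment. This is only a sketch: the module path `my_package.components` is hypothetical, and a real model package's setup.py contains more than this.

```python
# Fragment of a hypothetical setup.py for the packaged model.
# Assumption: MyComponent lives in my_package/components.py.
SETUP_KWARGS = {
    "name": "en_core_web_test_sm",
    "entry_points": {
        # "<factory name> = <module path>:<callable>"
        "spacy_factories": [
            "my_component = my_package.components:MyComponent",
        ],
    },
}

# In a real setup.py these keyword arguments would be handed to
# setuptools.setup(**SETUP_KWARGS); the call is omitted here so the
# fragment can be read (and imported) without running a build.
factory_name = SETUP_KWARGS["entry_points"]["spacy_factories"][0].split("=")[0].strip()
print(factory_name)
```

The name on the left of the `=` is what the pipeline refers to, so it has to match the component name listed in the model's pipeline.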

import spacy

nlp = spacy.load("en_core_web_test_sm", categories=["category1", "category2"])

the cfg parameter of __init__(self, nlp, **cfg) is EMPTY, so I can't get categories.

Why does this happen? Did I do something wrong? Is this a bug? I just followed the documentation.

Note: **overrides is not passed here: https://github.com/explosion/spaCy/blob/master/spacy/util.py#L209
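The effect can be reproduced without spaCy using a toy loader. This is a simplified model of the behaviour described in this report, not spaCy's actual code; `toy_load` and its arguments are hypothetical names.

```python
class MyComponent(object):
    name = "my_component"

    def __init__(self, nlp, **cfg):
        self.nlp = nlp
        self.categories = cfg.get("categories", "all_categories")


def toy_load(meta, factories, **overrides):
    """Simplified stand-in for a model loader; NOT spaCy's implementation."""
    nlp = object()  # placeholder for the Language object
    components = []
    for name in meta.get("pipeline", []):
        # Only the packaged pipeline_args are consulted here;
        # the overrides are accepted but never forwarded.
        config = meta.get("pipeline_args", {}).get(name, {})
        components.append(factories[name](nlp, **config))
    return components


meta = {"pipeline": ["my_component"]}
factories = {"my_component": MyComponent}

(component,) = toy_load(meta, factories, categories=["category1", "category2"])
print(component.categories)  # prints: all_categories
```

Because the loader never forwards `overrides`, the component only ever sees what is stored in the package's own metadata.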

Your Environment

  • Operating System: Linux
  • Python Version Used: 3.6
  • spaCy Version Used: 2.2.3
  • Environment Information: virtualenv

adrianeboyd added the feat / pipeline (Feature: Processing pipeline and components) label on Mar 13, 2020

@adrianeboyd
Contributor

adrianeboyd commented Mar 13, 2020

Thanks for the report, this does look like a problem for spacy.load().

@ines may have a better idea, but my first suggestion would be that pipeline_args in overrides should be used to update the config that's passed to the component:

    for name in pipeline:
        if name not in disable:
            config = meta.get("pipeline_args", {}).get(name, {})
            config.update(overrides.get("pipeline_args", {}).get(name, {}))
            factory = factories.get(name, name)
            component = nlp.create_pipe(factory, config=config)
            nlp.add_pipe(component, name=name)

instead of:

spaCy/spacy/util.py

Lines 205 to 210 in 26a90f0

    for name in pipeline:
        if name not in disable:
            config = meta.get("pipeline_args", {}).get(name, {})
            factory = factories.get(name, name)
            component = nlp.create_pipe(factory, config=config)
            nlp.add_pipe(component, name=name)
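The difference between the two loops comes down to how the per-component config is built. A minimal sketch of that merge step in isolation follows. It assumes, as in the first suggestion, that overrides carry a `pipeline_args` dict keyed by component name. The helper name `build_config`, the `threshold` key, and the defensive copy are additions made for illustration only.

```python
def build_config(meta, overrides, name):
    """Merge packaged pipeline_args with load-time overrides for one component."""
    # Copy first so the update below does not mutate the dict stored in meta
    # (an addition in this sketch; the suggested loop updates it in place).
    config = dict(meta.get("pipeline_args", {}).get(name, {}))
    # Values supplied at load time win over the packaged defaults.
    config.update(overrides.get("pipeline_args", {}).get(name, {}))
    return config


meta = {
    "pipeline_args": {
        "my_component": {"categories": "all_categories", "threshold": 0.5},
    }
}
overrides = {
    "pipeline_args": {
        "my_component": {"categories": ["category1", "category2"]},
    }
}

config = build_config(meta, overrides, "my_component")
print(config)
```

Keys present only in the packaged metadata survive the merge, while keys given at load time replace the packaged values.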

I think a second possibility would be to store the overrides in nlp somewhere so it's accessible to pipeline components from this point on:

nlp = cls(meta=meta, **overrides)
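A rough sketch of what this second possibility could mean for a component. `ToyLanguage` and its `overrides` attribute are hypothetical stand-ins invented for illustration; nothing here reflects how spaCy's Language class actually stores its arguments.

```python
class ToyLanguage(object):
    """Hypothetical stand-in for the Language class; not spaCy's implementation."""

    def __init__(self, meta=None, **overrides):
        self.meta = dict(meta or {})
        # Keep the load-time overrides so components can look them up later.
        self.overrides = dict(overrides)


class MyComponent(object):
    name = "my_component"

    def __init__(self, nlp, **cfg):
        self.nlp = nlp
        # Prefer an explicit cfg value, then fall back to the stored overrides.
        fallback = nlp.overrides.get("categories", "all_categories")
        self.categories = cfg.get("categories", fallback)


nlp = ToyLanguage(meta={"lang": "en"}, categories=["category1", "category2"])
component = MyComponent(nlp)  # cfg is empty, as in the report
print(component.categories)
```

With this approach the component has to know where to look on `nlp`, whereas the first suggestion keeps the component unchanged and fixes the loader instead.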

@alessandrobokan
Author

Hello guys, there is no answer from @adrianeboyd ... I think this issue is kinda important. It also appears in the spaCy documentation.

Thanks 👍

svlandeg added the bug (Bugs and behaviour differing from documentation) label on Mar 30, 2020

@svlandeg
Member

Addressed by PR #5374; the fix will be available from spaCy 2.3 onwards.

@github-actions
Contributor

github-actions bot commented Nov 6, 2021

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 6, 2021