
load_model_from_path does not pass through the list of Exclude #4707

Closed
oliviercwa opened this issue Nov 25, 2019 · 1 comment · Fixed by #4708
Labels
bug (Bugs and behaviour differing from documentation) · feat / serialize (Feature: Serialization, saving and loading)

Comments


oliviercwa commented Nov 25, 2019

How to reproduce the behaviour

1- Create a simple pipeline with a Tokenizer and one element
2- Save the pipeline and exclude the tokenizer:
nlp.to_disk(output_name, exclude = ['vocab', 'tokenizer'])
3- Load the pipeline again
nlp.from_disk(output_name, exclude = ['vocab', 'tokenizer'], disable = ['vocab', 'tokenizer'])
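
Putting the steps together, a minimal end-to-end sketch (the pipeline contents and output path are placeholders, and the reload is written as spacy.load, consistent with the traceback below):

import spacy

# 1- simple pipeline: the default tokenizer plus one component
nlp = spacy.blank("en")
nlp.add_pipe(nlp.create_pipe("sentencizer"))

# 2- save it, excluding the tokenizer (so no tokenizer file is written to disk)
output_name = "models/test"  # placeholder path
nlp.to_disk(output_name, exclude=["vocab", "tokenizer"])

# 3- load it again; the exclude list should make loading skip the missing tokenizer file
nlp2 = spacy.load(output_name, exclude=["vocab", "tokenizer"], disable=["vocab", "tokenizer"])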

Result:

  File "\external\spacy\spacy\__init__.py", line 27, in load
    return util.load_model(name, **overrides)
  File "\external\spacy\spacy\util.py", line 136, in load_model
    return load_model_from_path(Path(name), **overrides)
  File "\external\spacy\spacy\util.py", line 179, in load_model_from_path
    return nlp.from_disk(model_path)
  File "\external\spacy\spacy\language.py", line 836, in from_disk
    util.from_disk(path, deserializers, exclude)
  File "\external\spacy\spacy\util.py", line 636, in from_disk
    reader(path / key)
  File "\external\spacy\spacy\language.py", line 823, in <lambda>
    p, exclude=["vocab"]
  File "tokenizer.pyx", line 389, in spacy.tokenizer.Tokenizer.from_disk
  File "\AppData\Local\Programs\Python\Python37\lib\pathlib.py", line 1193, in open
    opener=self._opener)
  File "\AppData\Local\Programs\Python\Python37\lib\pathlib.py", line 1046, in _opener
    return self._accessor.open(self, flags, mode)
FileNotFoundError: [Errno 2] No such file or directory: '\train\\resources\\repository\\models\\test\\tokenizer'

Reason
The exclude list is not carried through from util.load_model_from_path to nlp.from_disk.
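
For context, the relevant code path in spaCy 2.1.9's util.py looks roughly like this (paraphrased sketch, not verbatim):

def load_model_from_path(model_path, meta=False, **overrides):
    ...
    nlp = cls(meta=meta, **overrides)    # overrides (including exclude) reach the Language constructor
    ...
    return nlp.from_disk(model_path)     # but are not forwarded here, so exclude is dropped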

Possible fix
Inside util.py, pass the **overrides through to nlp.from_disk:

def load_model_from_path(model_path, meta=False, **overrides):
    ...
    return nlp.from_disk(model_path, **overrides)
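
With a change along those lines, the reproduction above should load cleanly, e.g. (same placeholder path as in the sketch above):

nlp2 = spacy.load(output_name, exclude=["vocab", "tokenizer"])  # no FileNotFoundError for the tokenizer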

Your Environment

  • Operating System: Windows 10
  • Python Version Used: Python 3.7.4
  • spaCy Version Used: spaCy 2.1.9
  • Environment Information:
@adrianeboyd adrianeboyd added the feat / serialize Feature: Serialization, saving and loading label Nov 25, 2019
@ines ines added the bug Bugs and behaviour differing from documentation label Nov 25, 2019

lock bot commented Dec 26, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Dec 26, 2019