
Pickling of PhraseMatcher not yet resolved in 2.1.3 #3494

Closed
kushalc opened this issue Mar 27, 2019 · 5 comments
Labels
feat / matcher (Feature: Token, phrase and dependency matcher), feat / serialize (Feature: Serialization, saving and loading), help wanted (Contributions welcome!), more-info-needed (This issue needs more information)

Comments

@kushalc

kushalc commented Mar 27, 2019

#1971 and #3248 suggest that pickling works, but I can still reproduce the problem in 2.1.3.

Minimal Reproducible Example

Building off the #3252 unit test:

from spacy.matcher import PhraseMatcher
from spacy.lang.en import English
from spacy.compat import pickle
nlp = English()

matcher = PhraseMatcher(nlp.vocab)
matcher.add("TEST1", None, nlp("a"), nlp("b"), nlp("c"))
matcher.add("TEST2", None, nlp("d"))
with open("matcher.pickle", "wb") as f:
    pickle.dump(matcher, f)  # pickle.dump returns None, so don't assign its result
with open("matcher.pickle", "rb") as f:
    new_matcher = pickle.load(f)
print(new_matcher(nlp("a b c")))  # prints [(9789093191506324669, 0, 1), (9789093191506324669, 1, 2), (9789093191506324669, 2, 3)]
print(matcher(nlp("a b c")))  # prints [(9789093191506324669, 0, 1), (9789093191506324669, 1, 2), (9789093191506324669, 2, 3)]

(restart Python)

from spacy.lang.en import English
from spacy.compat import pickle
nlp = English()
with open("matcher.pickle", "rb") as f:
    new_matcher = pickle.load(f)
print(new_matcher(nlp("a b c")))  # prints []

Info about spaCy

  • spaCy version: 2.1.3
  • Platform: Darwin-18.2.0-x86_64-i386-64bit
  • Python version: 3.7.2
  • Models: en
@ines ines added bug Bugs and behaviour differing from documentation feat / serialize Feature: Serialization, saving and loading feat / matcher Feature: Token, phrase and dependency matcher help wanted Contributions welcome! labels Mar 27, 2019
svlandeg added a commit to svlandeg/spaCy that referenced this issue Jul 11, 2019
@svlandeg
Member

I can reproduce your error with the code above. However, I don't think it's a matcher pickling issue. If you kept the same nlp object rather than constructing a new one, you wouldn't run into this, right?
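To illustrate the point being made here, a minimal pure-Python analogy (this is not spaCy's actual implementation, which uses stable 64-bit string hashes; the `Vocab`, `TinyMatcher`, and incremental-ID scheme below are invented for illustration): when a matcher stores only IDs that are meaningful relative to one vocab object, a pickle round-trip in the same process works, but resolving those IDs against a freshly constructed vocab silently misfires.

```python
import pickle

class Vocab:
    """Assigns incremental IDs to strings (a stand-in for a string store)."""
    def __init__(self):
        self.ids = {}
    def id_for(self, s):
        return self.ids.setdefault(s, len(self.ids))

class TinyMatcher:
    """Stores pattern IDs only; matching resolves tokens through a vocab."""
    def __init__(self, vocab):
        self.vocab = vocab
        self.pattern_ids = set()
    def add(self, s):
        self.pattern_ids.add(self.vocab.id_for(s))
    def __call__(self, tokens, vocab=None):
        v = vocab if vocab is not None else self.vocab
        return [t for t in tokens if v.ids.get(t) in self.pattern_ids]

vocab = Vocab()
m = TinyMatcher(vocab)
m.add("a")

restored = pickle.loads(pickle.dumps(m))   # the vocab travels with the matcher
print(restored(["a", "b"]))                # ['a'] -- same-process round-trip works

fresh = Vocab()                            # a brand-new "nlp"-style object
fresh.id_for("b"); fresh.id_for("a")       # same strings, different IDs
print(restored(["a", "b"], vocab=fresh))   # ['b'] -- IDs no longer line up
```

The point of the sketch: the matcher's pickled state is intact, but it only makes sense relative to the object it was built against, which is why reconstructing `English()` in a new session changes the behavior.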

@kushalc
Author

kushalc commented Jul 11, 2019 via email

@svlandeg
Member

svlandeg commented Jul 11, 2019

Is it an option to serialize the nlp object (https://spacy.io/usage/saving-loading)?

import spacy

nlp1.to_disk("mynlp")
nlp2 = spacy.load("mynlp")
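A related workaround in the same spirit (a sketch, not an official spaCy recipe): instead of pickling the matcher object itself, persist the raw pattern texts as plain data and rebuild the matcher against whatever nlp object is live in the new session. The filename and dict layout here are invented for illustration.

```python
import json

# Persist patterns as plain data rather than as a matcher object.
patterns = {"TEST1": ["a", "b", "c"], "TEST2": ["d"]}
with open("patterns.json", "w") as f:
    json.dump(patterns, f)

# Later, possibly in a new process: reload and rebuild against the live vocab.
with open("patterns.json") as f:
    loaded = json.load(f)

# Rebuilding would then reuse the add() calls from the repro above, e.g.:
# for key, texts in loaded.items():
#     matcher.add(key, None, *(nlp(t) for t in texts))  # spaCy 2.x add signature
print(loaded)
```

Because the patterns are re-tokenized by the current nlp object, the matcher's internal state is always consistent with the vocab it runs against.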

@svlandeg svlandeg added more-info-needed This issue needs more information and removed bug Bugs and behaviour differing from documentation labels Aug 2, 2019
@no-response

no-response bot commented Aug 16, 2019

This issue has been automatically closed because there has been no response to a request for more information from the original author. With only the information that is currently in the issue, there's not enough information to take action. If you're the original author, feel free to reopen the issue if you have or find the answers needed to investigate further.

@no-response no-response bot closed this as completed Aug 16, 2019
@lock

lock bot commented Sep 15, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Sep 15, 2019
3 participants