Intent Parser Multi-word Entities #2944

Closed
jfantell opened this issue Nov 18, 2018 · 2 comments
Labels: usage (General spaCy usage)

@jfantell
I am trying to train a model using the following script:

https://github.com/explosion/spaCy/blob/master/examples/training/train_intent_parser.py

I am providing the model this sample data:

("show me the best Marriot hotel in New York", {
'heads': [0, 0, 5, 5, 5, 0, 7, 5, 5],
'deps': ['ROOT', '-', '-', 'QUALITY', 'PLACE', 'PLACE', '-', 'LOCATION', 'LOCATION']
})

Currently, this produces the following output:

[('show', 'ROOT', 'show'), ('best', 'QUALITY', 'hotel'), ('Marriot', 'PLACE', 'hotel'), ('hotel', 'PLACE', 'show'), ('New', 'LOCATION', 'hotel'), ('York', 'LOCATION', 'hotel')]

Instead, I want it to produce this output:

[('show', 'ROOT', 'show'), ('best', 'QUALITY', 'hotel'), ('Marriot hotel', 'PLACE', 'hotel'), ('New York', 'LOCATION', 'hotel')]

I did not find any documentation on how multi-word entities such as "New York" and "Marriot hotel" can be extracted using the intent parser. Could someone please advise me as to how this could be done? Thank you for your time in advance!

@honnibal (Member)

You could try merging the named entities into one token each before training your intent parser. Alternatively, you might be better off leaving them as multiple tokens, and then dealing with the subtree afterwards. You can find docs about the merge method here: https://spacy.io/api/doc#merge
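
As a rough illustration of the second option (leave the tokens separate and post-process afterwards), here is a minimal sketch in plain Python. It does not use spaCy at all: it assumes you already have the token texts plus the predicted head indices and labels, laid out like the `heads` and `deps` lists in the question. The function name `merge_labelled_spans` is made up for this example. Runs of adjacent tokens with the same label are joined into one phrase, and the phrase takes the head of whichever token in the run points outside it.

```python
def merge_labelled_spans(words, heads, deps, ignore="-"):
    """Join runs of adjacent tokens that share a label into single phrases.

    Hypothetical helper, not part of spaCy. Returns a list of
    (phrase, label, head_text) triples; tokens labelled `ignore` are skipped.
    """
    results = []
    i = 0
    while i < len(words):
        label = deps[i]
        if label == ignore:
            i += 1
            continue
        # Extend the span while the next token carries the same label.
        j = i + 1
        while j < len(words) and deps[j] == label:
            j += 1
        span = range(i, j)
        phrase = " ".join(words[i:j])
        # The span's root is the first token whose head lies outside the span,
        # or that is its own head (as ROOT is in this annotation scheme).
        root = next((k for k in span if heads[k] not in span or heads[k] == k), i)
        head = heads[root]
        head_text = phrase if head in span else words[head]
        results.append((phrase, label, head_text))
        i = j
    return results


# Annotations copied from the question; real input would be the parser's predictions.
words = "show me the best Marriot hotel in New York".split()
heads = [0, 0, 5, 5, 5, 0, 7, 5, 5]
deps = ["ROOT", "-", "-", "QUALITY", "PLACE", "PLACE", "-", "LOCATION", "LOCATION"]

merged = merge_labelled_spans(words, heads, deps)
print(merged)
```

With the annotations above, "Marriot hotel" comes out attached to "show" rather than "hotel", because "hotel" is itself part of the merged phrase and its own head is "show". Note also that two different entities with the same label sitting directly next to each other would be joined into one phrase, so this heuristic only works when such entities are separated by at least one other token.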

@honnibal added the usage (General spaCy usage) label on Nov 26, 2018

@lock (bot) commented Dec 26, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock (bot) locked this thread as resolved and limited conversation to collaborators on Dec 26, 2018