
Merging a noun_chunk slice for Hearst Pattern Detection #5450

Solved: the problem is indeed merging a noun_chunk slice of zero length.

I've developed the following to prevent zero-length chunks:

```python
# two test sentences; the second overwrites the first and is the one
# that triggers the zero-length merge
doc = nlp("We are using docs, spans, tokens and some other spacy features, such as merge entities, merge noun chunks and especially retokenizer")
doc = nlp("this is a kind of magic")
predicates = [["some"], ["some", "other"], ["such", "as"], ["especially"], ["a", "kind", "of"]]
# zero-length spans were being created by the ["a", "kind", "of"] predicate term

# relevant patterns:
# hypernym = {"POS": {"IN": ["NOUN", "PROPN"]}}
# hyponym = {"POS": {"IN": ["NOUN", "PROPN"]}}
# punct = {"IS_PUNCT": True, "OP": "?"}
# {"label": "such_as", "pattern": [hy…
```

Answer selected by ines
This discussion was converted from issue #5450 on December 11, 2020 00:17.