Fix ValueError exception on empty Korean text. #4245

b1uec0in · 2019-09-06T03:56:45Z

Description

Fixed the check_spaces function so that it returns a generator yielding 0 items instead of 1 when the text is an empty string.

This fixes the following error:

  File ".../spacy/lang/ko/__init__.py", line 72, in __call__
    doc = Doc(self.vocab, words=surfaces, spaces=list(check_spaces(text, surfaces)))
  File "doc.pyx", line 209, in spacy.tokens.doc.Doc.__init__
ValueError: [E027] Arguments 'words' and 'spaces' should be sequences of the same length, or 'spaces' should be left default at None. spaces should be a sequence of booleans, with True meaning that the word owns a ' ' character following it.
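
For illustration, here is a minimal sketch of the length mismatch and the guard that avoids it. It is not the spaCy implementation: the name trailing_spaces and the guard_empty parameter are hypothetical, and the sketch assumes the tokens appear in order in the text and that the last token never owns a trailing space.

```python
from typing import Iterator, Sequence


def trailing_spaces(
    text: str, tokens: Sequence[str], guard_empty: bool = True
) -> Iterator[bool]:
    """Yield one flag per token: True if the token is followed by a gap.

    Hypothetical helper for illustration only. Assumes every token occurs
    in `text`, in order.
    """
    end = 0
    for i, token in enumerate(tokens):
        start = text.find(token, end)
        if i > 0:
            # Flag for the previous token: is there a gap before this one?
            yield start > end
        end = start + len(token)
    # Flag for the last token (assumed to have no trailing space).
    # Without the guard, this fires even when there are no tokens at all,
    # producing 1 flag for 0 words.
    if tokens or not guard_empty:
        yield False


words = []  # an empty text produces no tokens
unguarded = list(trailing_spaces("", words, guard_empty=False))
guarded = list(trailing_spaces("", words))
print(len(words), len(unguarded), len(guarded))
```

With the guard disabled, the empty input gives 0 words but 1 space flag, which is the kind of mismatch that E027 reports. With the guard enabled, both sequences are empty.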

Types of change

Bug fix

Checklist

I have submitted the spaCy Contributor Agreement.
I ran the tests, and all new and existing tests passed.
My changes don't require a change to the documentation, or if they do, I've added all required information.

ines · 2019-09-06T08:29:37Z

Thanks a lot! 👍

Fix ValueError exception on empty Korean text.

d1edf00

svlandeg added the lang / ko, bug, and feat / tokenizer labels Sep 6, 2019

ines merged commit a55f5a7 into explosion:master Sep 6, 2019
