Skip to content
This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

dont add START and END tokens to source in CopyNet #79

Merged
merged 3 commits into from
Jun 23, 2020
Merged

Conversation

epwalsh
Copy link
Member

@epwalsh epwalsh commented Jun 22, 2020

This is a followup to our discussion here: #76 (comment)

Copy link
Contributor

@matt-gardner matt-gardner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

"This may result in errors or duplicate special tokens in your source sequences. ",
UserWarning,
)
if (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This warning still seems reasonable.

tokenized_source.append(Token(END_SYMBOL))
if not tokenized_source:
# If the tokenized source is empty, it will cause issues downstream.
raise ConfigurationError(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think ConfigurationError is the right thing here. We should probably move away from using it anywhere except in FromParams and Registrable, preferring ValueError or whatever other error is appropriate. This one looks like a ValueError to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

@epwalsh epwalsh merged commit 935a2a8 into master Jun 23, 2020
@epwalsh epwalsh deleted the copynet branch June 23, 2020 00:39
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants