Skip to content
This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Update some models for AMP training #104

Merged
merged 13 commits into from
Aug 10, 2020
Merged

Conversation

epwalsh
Copy link
Member

@epwalsh epwalsh commented Jul 31, 2020

Fixes models that use an RNN cell by wrapping the forward call to the cell within an autocast(False) context.

@epwalsh
Copy link
Member Author

epwalsh commented Aug 1, 2020

Current failure may be related to pytorch/pytorch#36428, but note that we're actually using LSMTCell directly here and still seeing a failure, despite this comment which says AMP should work fine when using just RNN cells.

Copy link
Contributor

@matt-gardner matt-gardner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Sorry, I missed the review request earlier somehow. Luckily, you've been on vacation, so no harm done :).

allennlp_models/generation/models/copynet_seq2seq.py Outdated Show resolved Hide resolved
overrides="{'trainer.use_amp':true,'trainer.cuda_device':0}",
)

# NOTE: as of writing this test, AMP does not work with RNNs. Hence we had
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again here, holdover from earlier.

@epwalsh epwalsh merged commit e5f5c62 into master Aug 10, 2020
@epwalsh epwalsh deleted the update-some-models-for-amp branch August 10, 2020 16:46
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants