
GradScaler: Do not enable when training on CPU #623

Closed

Conversation

danieldk (Contributor) commented on Mar 23, 2022

  • Disable gradient scaling and emit a warning when CPU training is
    performed.
  • Disable gradient scaling and emit a warning when a PyTorch version
    without gradient scaling is used (rather than raising an exception).

See explosion/spaCy#10527

Note: putting this in draft. I'd like to add a similar check and warning to the shim.
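
For reference, a minimal sketch of the behavior described above; this is not Thinc's actual implementation, and the helper name `make_grad_scaler` and the no-op fallback class are hypothetical:

```python
import warnings

import torch


class _NoopGradScaler:
    """Fallback with the same interface, used when scaling is unavailable."""

    def scale(self, loss):
        return loss

    def step(self, optimizer):
        optimizer.step()

    def update(self):
        pass


def make_grad_scaler(enabled: bool = True):
    # Warn and disable (instead of raising) when this PyTorch version has
    # no gradient scaling support at all.
    if not hasattr(torch.cuda.amp, "GradScaler"):
        if enabled:
            warnings.warn(
                "Gradient scaling is not supported by this version of "
                "PyTorch; gradient scaling will be disabled."
            )
        return _NoopGradScaler()
    # Warn and disable when training on CPU: gradient scaling only applies
    # to mixed-precision GPU training.
    if enabled and not torch.cuda.is_available():
        warnings.warn(
            "Gradient scaling requires a GPU; gradient scaling will be "
            "disabled for CPU training."
        )
        enabled = False
    return torch.cuda.amp.GradScaler(enabled=enabled)
```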

@danieldk force-pushed the disable-gradscaler-without-cuda branch from 467e95a to 6c6463c on March 24, 2022 07:39
danieldk (Contributor, Author) commented:
This turns out to be quite tricky, since Thinc may not be switched to GPU yet when PyTorchGradScaler is constructed. So, checking whether we are using a GPU in the constructor may disable gradient scaling unintentionally.
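
A hypothetical sketch of the timing issue and one way around it: deferring the device check from the constructor to the first `scale()` call, so the check happens after Thinc has (possibly) switched to the GPU. All names here are illustrative, not Thinc's API:

```python
import warnings

import torch


class LazyGradScaler:
    """Defer the device check from __init__ to the first scale() call."""

    def __init__(self, enabled: bool = True):
        self._enabled = enabled
        self._scaler = None  # created lazily, once we can see the device

    def scale(self, loss: torch.Tensor) -> torch.Tensor:
        if self._scaler is None:
            # By the time the first loss arrives, Thinc has either switched
            # to the GPU or not; the tensor's device tells us which.
            if self._enabled and not loss.is_cuda:
                warnings.warn(
                    "Model is not on a CUDA device; disabling gradient "
                    "scaling."
                )
                self._enabled = False
            self._scaler = torch.cuda.amp.GradScaler(enabled=self._enabled)
        return self._scaler.scale(loss)
```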

I will close this PR in favor of #624, which replaces the assertion with an exception and describes how this error can be avoided.

danieldk closed this on Mar 24, 2022
danieldk deleted the disable-gradscaler-without-cuda branch on March 24, 2022 08:54