Error when using torchcompile option for CLIP training #726
@kkjh0723 I think it might break with gradient checkpointing? Not sure there is a workaround; possibly using non-reentrant mode?
I got the same error trying to run both.
@EIFY did you try forcing the non-reentrant checkpointing? Could look to change the default if that works...
@rwightman No, I haven't tried that. In that regard, the good news is that pytorch/pytorch#79887 is now fixed, so in open_clip/src/open_clip/transformer.py (lines 320 to 322 at 91923df) we should be able to do e.g.

```python
if self.grad_checkpointing and not torch.jit.is_scripting():
    x = checkpoint(r, x, None, None, attn_mask, use_reentrant=False)
```

The bad news is that, other than that, gradient checkpointing is invoked differently elsewhere (open_clip/src/open_clip/model.py, lines 260 to 263 at 91923df) or not supported at all (open_clip/src/open_clip/modified_resnet.py, lines 161 to 164 at 91923df). So fairly involved changes would be necessary. I will try doing the easy part and see if it at least gets past that when I get a chance.
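A minimal stand-alone sketch of the non-reentrant checkpointing being discussed; the toy `Block`/`Net` modules and dimensions here are illustrative assumptions, not open_clip's actual classes:

```python
import torch
from torch.utils.checkpoint import checkpoint

class Block(torch.nn.Module):
    """Toy residual-free block standing in for a transformer block."""
    def __init__(self, dim=16):
        super().__init__()
        self.linear = torch.nn.Linear(dim, dim)

    def forward(self, x):
        return torch.relu(self.linear(x))

class Net(torch.nn.Module):
    def __init__(self, dim=16, depth=2):
        super().__init__()
        self.blocks = torch.nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for blk in self.blocks:
            # use_reentrant=False selects the non-reentrant checkpoint
            # implementation, which plays better with torch.compile
            x = checkpoint(blk, x, use_reentrant=False)
        return x

net = Net()
x = torch.randn(4, 16, requires_grad=True)
out = net(x)
out.sum().backward()
print(x.grad.shape)  # torch.Size([4, 16])
```

The key difference from the default reentrant path is that activations are recomputed via saved-tensor hooks rather than a nested autograd call, so the graph stays traceable.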
@rwightman OK, so it turned out that
Is there any update on this? I am facing the same issue. |
Hello,
While attempting to apply the torchcompile option for training a CLIP ViT-B-32 model, I got an error.
Below is the script to run training.
And I got the below error message.
How can I fix this issue?
Note that my PyTorch version is 2.1.0 and no error occurs when I run the above script without the --torchcompile option.
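For context, the `--torchcompile` flag wraps the model with `torch.compile`; a minimal stand-alone illustration of that wrapping (the toy model is an assumption, and `backend="eager"` is used only so the sketch runs without a codegen toolchain, whereas real training uses the default inductor backend):

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(8, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 8),
)

# backend="eager" traces the graph but skips compiled codegen,
# so this sketch runs anywhere torch 2.x is installed
compiled = torch.compile(model, backend="eager")

x = torch.randn(2, 8)
y = compiled(x)
print(y.shape)  # torch.Size([2, 8])
```

The error in this issue arises because reentrant gradient checkpointing inside the model breaks the graph that `torch.compile` tries to capture.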