More Flexibility In Setting Learning Rates #707

Open
rsomani95 opened this issue Oct 25, 2023 · 0 comments


With the current setup, the same learning rate is applied to all non-gain, non-bias params of both the text and image encoders. It would be nice to have more flexibility here, e.g. setting optimizer hyperparameters such as learning rate and weight decay per encoder. For instance, the SigLIP paper gets peak performance with pretrained image encoders by disabling weight decay on the image encoder (though I'm not sure if that's the trunk, the head, or both). Here's the figure from the paper for reference:
[Image: screenshot of the figure from the SigLIP paper]

I'm not sure what the best mechanism to accommodate the various use cases would be. One more useful fine-tuning setup I can imagine is setting differential learning rates for different parts of the network.
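One possible mechanism, as a rough sketch only: build optimizer parameter groups from name-prefix rules, so that each part of the network can override the learning rate and/or weight decay. Everything named here is hypothetical and not part of the current codebase: `GroupRule`, `build_param_groups`, `is_gain_or_bias`, the `visual.` prefix, and the numeric values are all assumptions made for illustration. The gain/bias check is likewise an assumed heuristic. The stand-in parameters are there only so the sketch runs without PyTorch.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class GroupRule:
    """Hypothetical override, matched against the parameter-name prefix."""
    prefix: str
    lr: Optional[float] = None            # None -> use the default lr
    weight_decay: Optional[float] = None  # None -> use the default weight decay


def is_gain_or_bias(name: str, param: Any) -> bool:
    # Assumed heuristic: tensors with fewer than 2 dims (norm gains, biases,
    # scalars) plus anything named like a bias or the logit scale.
    return param.ndim < 2 or "bias" in name or "logit_scale" in name


def build_param_groups(
    named_params: Iterable[Tuple[str, Any]],
    rules: List[GroupRule],
    default_lr: float,
    default_wd: float,
) -> List[Dict[str, Any]]:
    """Bucket parameters by their resolved (lr, weight_decay) pair."""
    buckets: Dict[Tuple[float, float], List[Any]] = {}
    for name, param in named_params:
        if not getattr(param, "requires_grad", True):
            continue  # frozen parameters are left out of the optimizer
        # The first matching rule wins; unmatched parameters use the defaults.
        rule = next((r for r in rules if name.startswith(r.prefix)), None)
        lr = rule.lr if rule is not None and rule.lr is not None else default_lr
        wd = (
            rule.weight_decay
            if rule is not None and rule.weight_decay is not None
            else default_wd
        )
        if is_gain_or_bias(name, param):
            wd = 0.0  # gains and biases are never decayed in this sketch
        buckets.setdefault((lr, wd), []).append(param)
    return [
        {"params": params, "lr": lr, "weight_decay": wd}
        for (lr, wd), params in buckets.items()
    ]


# Stand-in parameters; in practice this would be model.named_parameters().
fake_named_params = [
    ("visual.trunk.weight", SimpleNamespace(ndim=2)),
    ("visual.trunk.bias", SimpleNamespace(ndim=1)),
    ("text.proj.weight", SimpleNamespace(ndim=2)),
    ("logit_scale", SimpleNamespace(ndim=0)),
]

# Lower lr and no weight decay on the image encoder; defaults elsewhere.
rules = [GroupRule(prefix="visual.", lr=1e-5, weight_decay=0.0)]
groups = build_param_groups(fake_named_params, rules, default_lr=1e-3, default_wd=0.2)
```

The returned list has the per-parameter-group shape that PyTorch optimizers accept, so something like `torch.optim.AdamW(groups)` should work with real parameters. Separate rules for the trunk and the head would cover the case where only one of them should skip weight decay.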
