Load models in-place #32

Merged
merged 2 commits into from
Apr 10, 2024

Conversation

danieldk
Contributor

Description

Before this change, we would first construct a model using the arguments passed to the registry function. Then we would construct it again using from_hf_hub. This was not only a performance issue, but also a correctness issue: the model constructed through from_hf_hub could have different hyperparameters than those specified in the arguments to the registry function.

This change fixes both problems by using the new in-place loading support in Curated Transformers 2.0.
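
To illustrate the difference between the two loading strategies, here is a minimal toy sketch. It does not use the Curated Transformers API: `ToyConfig`, `ToyModel`, `from_checkpoint`, `load_checkpoint_`, and `CHECKPOINT` are all hypothetical names invented for this example. It only shows why constructing twice wastes work and can silently replace the requested hyperparameters, while in-place loading keeps them.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ToyConfig:
    # Hypothetical hyperparameters, for illustration only.
    hidden_width: int
    dropout_prob: float = 0.1


# Stand-in for a checkpoint on a model hub: its own config plus parameters.
CHECKPOINT: Dict[str, Any] = {
    "config": ToyConfig(hidden_width=768, dropout_prob=0.1),
    "params": {"weight": [0.1, 0.2]},
}


class ToyModel:
    constructions = 0  # counts how often a model is constructed

    def __init__(self, config: ToyConfig):
        ToyModel.constructions += 1
        self.config = config
        self.params: Dict[str, Any] = {}

    @classmethod
    def from_checkpoint(cls, checkpoint: Dict[str, Any]) -> "ToyModel":
        # Builds a *new* model from the checkpoint's own configuration.
        model = cls(checkpoint["config"])
        model.params = dict(checkpoint["params"])
        return model

    def load_checkpoint_(self, checkpoint: Dict[str, Any]) -> "ToyModel":
        # In-place: only the parameters are replaced, the config is kept.
        self.params = dict(checkpoint["params"])
        return self


def load_twice(config: ToyConfig, checkpoint: Dict[str, Any]) -> ToyModel:
    # Old behaviour in this sketch: the first model is built and discarded.
    ToyModel(config)
    return ToyModel.from_checkpoint(checkpoint)


def load_in_place(config: ToyConfig, checkpoint: Dict[str, Any]) -> ToyModel:
    # New behaviour in this sketch: construct once, then load into it.
    return ToyModel(config).load_checkpoint_(checkpoint)
```

In this sketch, `load_twice` returns a model carrying the checkpoint's `dropout_prob`, whereas `load_in_place` returns one carrying the value the caller asked for.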

The addition of in-place loading also added the dtype argument to the model configuration. We now expose this argument in the v2 versions of the registry functions. Configuration filling is also updated to fill the data type from the torch_dtype option in the HF model configuration.
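
As a rough sketch of the configuration filling described above, the helper below picks a data type name from an HF-style configuration dictionary. `fill_dtype` is a hypothetical helper, not a function from this project. Giving an explicitly requested value precedence and falling back to `"float32"` are assumptions made for the example. The dtype is kept as a string here so that the snippet runs without PyTorch; real code would likely convert it to a `torch.dtype`.

```python
from typing import Any, Dict, Optional


def fill_dtype(
    hf_config: Dict[str, Any],
    requested: Optional[str] = None,
    default: str = "float32",
) -> str:
    """Return the data type name to use for a model (hypothetical helper)."""
    # Assumption: an explicitly requested dtype wins over the HF config.
    if requested is not None:
        return requested
    # HF model configurations store the dtype under `torch_dtype`,
    # typically as a string such as "float16" or "bfloat16".
    value = hf_config.get("torch_dtype")
    # Assumption: fall back to a default when the option is absent.
    return default if value is None else str(value)
```

For example, `fill_dtype({"torch_dtype": "bfloat16"})` yields `"bfloat16"`, while a configuration without the option falls back to the assumed default.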

Types of change

Bugfix/feature

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@danieldk danieldk added bug Something isn't working enhancement New feature or request labels Apr 10, 2024
Otherwise this fails on GPU, but the difference is too small to matter.
@danieldk
Contributor Author

GPU test run: ✅

@danieldk danieldk merged commit 89ce8f7 into explosion:v4 Apr 10, 2024
7 of 8 checks passed
@danieldk danieldk deleted the feature/inplace branch April 10, 2024 11:36