
load_in_8bit is not working for some huggingface model #14

Closed
sanyalsunny111 opened this issue Aug 18, 2022 · 9 comments
@sanyalsunny111

I have updated the transformers package and I am using the ViLT model: https://huggingface.co/docs/transformers/model_doc/vilt#transformers.ViltForQuestionAnswering

[screenshot]

I am getting this error. Is load_in_8bit not integrated with all Hugging Face models? Could you please let me know how to use load_in_8bit for any Hugging Face model, not just BLOOM and T5?

[screenshot]

@younesbelkada
Collaborator

younesbelkada commented Aug 18, 2022

Hi @sanyalsunny111
Thank you very much for your message
Your initial issue is related to the fact that you did not install the latest version of transformers. Since the new features of the library have not been released yet, you cannot retrieve them with pip install transformers. Therefore, you have to manually install the latest version by running:
pip install git+https://github.com/huggingface/transformers.git

However, this model does not support device_map="auto" yet. This should be addressed in the PR huggingface/transformers#18683 and will therefore be available as soon as the fix is merged.
If you want to use this feature right away, you can directly install the transformers version that contains the ViLT support. I made an example Colab that you can try out here.
Let me know if anything else is unclear!
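
For reference, a minimal sketch of what the working call looks like once a source build of transformers with ViLT support is installed (the model id and the two keyword arguments are the ones used later in this thread; treat the surrounding details as illustrative assumptions):

```python
# Minimal sketch, assuming a source install of transformers that supports
# device_map="auto" for ViLT, e.g.:
#   pip install git+https://github.com/huggingface/transformers.git
from transformers import ViltForQuestionAnswering

model = ViltForQuestionAnswering.from_pretrained(
    "dandelin/vilt-b32-finetuned-vqa",
    device_map="auto",    # let accelerate dispatch the weights across available devices
    load_in_8bit=True,    # quantize the linear layers to int8 via bitsandbytes
)
```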

@sanyalsunny111
Author

@younesbelkada Thank you for your previous response. You rightly mentioned that device_map="auto" is not supported yet, and without it we cannot run an 8-bit model. But my question is: how did you use device_map="auto" in the Colab link you shared in your previous comment?
[screenshot]

@younesbelkada
Collaborator

Hi @sanyalsunny111
If you follow the same installation guidelines as in the Google Colab I shared with you, you should be able to pass device_map="auto" without any problems.

@younesbelkada
Collaborator

Hi @sanyalsunny111 !
No worries, I think you still haven't installed the correct version, because your previous transformers installation probably was not removed.
Could you try this command? pip install --force-reinstall git+https://github.com/younesbelkada/transformers.git@eee3986ec37e3050c1ee94a63efb13090602eae5
Thanks!
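
A quick way to confirm which transformers build is actually being imported (a simple sanity check, not a step from the thread):

```python
# Sanity check: print the version string and the install path of the
# transformers package that Python actually picks up.
import transformers

print(transformers.__version__)  # should reflect the dev/fork build just installed
print(transformers.__file__)     # location of the active installation
```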

@sanyalsunny111
Author

Hey @younesbelkada Thank you very much sir. It is working fine.

@younesbelkada
Collaborator

younesbelkada commented Aug 19, 2022

Great! Very happy that you made it work! 💪 Do not hesitate to open an issue if you run into any new problem.

@sanyalsunny111
Author

Hey @younesbelkada, device_map="auto" is actually interfering with distributed data parallel (DDP). I am using 8 GPUs and trying to run faster inference. Here is the error I am getting with model = ViltForQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", device_map="auto", load_in_8bit=True)
[screenshot of the error]

Could you please suggest how to use load_in_8bit with DDP?
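
The thread does not resolve this, but one commonly suggested workaround (an assumption, not confirmed here) is to give each DDP process a full copy of the model on its own GPU by passing an explicit device map instead of "auto", which otherwise shards the model across all visible GPUs:

```python
# Hypothetical sketch, not confirmed in this thread: map the whole model to
# the local rank's GPU so each DDP process owns a full replica.
import os
import torch
from transformers import ViltForQuestionAnswering

local_rank = int(os.environ.get("LOCAL_RANK", 0))  # set by torchrun
torch.cuda.set_device(local_rank)

model = ViltForQuestionAnswering.from_pretrained(
    "dandelin/vilt-b32-finetuned-vqa",
    device_map={"": local_rank},  # "" means: place every module on this device
    load_in_8bit=True,
)
```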

@younesbelkada
Collaborator

Hi @sanyalsunny111
Thanks for your message!
Did the error also happen with load_in_8bit=False? Could you also share the full script to reproduce the issue?
Thanks

@sanyalsunny111
Author

sanyalsunny111 commented Aug 19, 2022

Hey @younesbelkada, sorry to bother you with more errors. Yes, this error also happened with load_in_8bit=False; the code is attached in screenshot 1.
[screenshot 1]
Now, when I am not using load_in_8bit at all, no error happens, so it's safe to assume that either device_map or load_in_8bit is causing the error. Here is my piece of code, and here is the Hugging Face tutorial my code is based on.
