
Cannot use int8 #9

Closed

RiverDong opened this issue Aug 14, 2022 · 1 comment

Comments

@RiverDong

I tried to run BLOOM on 8x A100, but I cannot load it with `load_in_8bit`. I followed the instructions here and loaded the model with `model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto', load_in_8bit=True, max_memory=max_memory)`. Basically, if I don't pass `max_memory=max_memory`, most of the memory goes to `gpu:0` and I get a CUDA out-of-memory error. If I do pass `max_memory=max_memory`, it throws "8-bit operations are not supported under CPU".
[Screenshot: error traceback, Aug 13, 2022]
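For reference, a minimal sketch of the load described above, assuming the `bigscience/bloom` checkpoint and the 3 GB-per-GPU budget mentioned in the reply below; it is illustrative, not the reporter's exact script:

```python
# Minimal sketch, assuming bigscience/bloom and a 3GB-per-GPU budget;
# not the reporter's exact script.
import torch
from transformers import AutoModelForCausalLM

model_name = "bigscience/bloom"  # assumed checkpoint

# One entry per visible GPU; a budget this small leaves no room for the
# weights, so accelerate offloads layers to CPU, triggering the 8-bit error.
max_memory = {i: "3GB" for i in range(torch.cuda.device_count())}

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",      # let accelerate shard the model across devices
    load_in_8bit=True,      # int8 weights via bitsandbytes
    max_memory=max_memory,  # per-device memory cap used by the device map
)
```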

@TimDettmers
Collaborator

Looking again at this error, I realize the problem is likely that you set the memory threshold too low in `max_memory`. You are currently allowing 3 GB per GPU, for a total of 24 GB across 8 GPUs, but BLOOM needs ~180 GB of GPU memory. You can set it to ~36 GB per GPU if you have A100s with 40 GB of memory (or higher if you have the 80 GB ones).

We will fix the error message to note that this error appears when not enough GPU memory is allocated.
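A sketch of the suggested fix, assuming 8x 40 GB A100s and the `bigscience/bloom` checkpoint; the exact per-GPU budget may need tuning to leave headroom for activations and CUDA overhead:

```python
# Suggested budget: ~36GB on each of eight 40GB A100s (~288GB total),
# enough to hold BLOOM's ~180GB of int8 weights entirely on GPU.
max_memory = {i: "36GB" for i in range(8)}

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom",     # assumed checkpoint
    device_map="auto",
    load_in_8bit=True,
    max_memory=max_memory,  # high enough that no layer falls back to CPU
)
```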

techthiyanes pushed a commit to techthiyanes/bitsandbytes-1 that referenced this issue Jul 7, 2023

TNTran92 pushed a commit to TNTran92/bitsandbytes that referenced this issue Mar 24, 2024
…ix_bfloat16: Enable hip_bfloat16 for optim tests