Cannot use int8 #9
Looking again at this error, I realize the problem is likely that you set the memory threshold too low in max_memory. You are currently allowing 3 GB per GPU, for a total of 24 GB across 8 GPUs, but BLOOM needs ~180 GB of GPU memory. You can set it to ~36 GB per GPU if you have A100s with 40 GB of memory (or higher if you have the 80 GB ones). We will fix the error message to note that it appears when not enough GPU memory is allocated.
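For reference, a minimal sketch of what such a `max_memory` mapping could look like on a node with 8x A100 40 GB; the checkpoint name and the exact per-GPU budget here are illustrative assumptions, not fixed values:

```python
import torch
from transformers import AutoModelForCausalLM

# Allow ~36 GiB on each visible GPU so BLOOM's ~180 GB of weights
# can be sharded across all 8 devices instead of overflowing gpu:0.
# (Values are illustrative; tune them to your hardware.)
max_memory = {i: "36GiB" for i in range(torch.cuda.device_count())}

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom",   # assumed checkpoint; substitute your model name
    device_map="auto",    # let accelerate shard layers under the budget
    load_in_8bit=True,
    max_memory=max_memory,
)
```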
I tried to use 8x A100 GPUs to run BLOOM, but I cannot use load_in_8bit. I followed the instructions here and loaded the model with:
```python
model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto', load_in_8bit=True, max_memory=max_memory)
```
Basically, if I don't pass max_memory=max_memory, most of the memory goes to gpu:0 and I get a CUDA out of memory error. If I do pass max_memory=max_memory, it throws "8-bit operations are not supported under CPU".
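If it helps with debugging, here is a minimal sketch (assuming the model was loaded with `device_map='auto'` as above) of how to check where each module was placed; any entries mapped to `cpu` would explain the 8-bit-under-CPU error, since the int8 kernels only run on GPU:

```python
# Sketch: models loaded via device_map='auto' expose hf_device_map,
# a dict mapping module names to the device each was placed on.
# Modules that spilled to "cpu" are the ones the int8 path cannot handle.
from collections import Counter

placements = Counter(model.hf_device_map.values())
print(placements)
# e.g. a nonzero 'cpu' count means some layers were offloaded to CPU
```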