
Multi-GPU QLoRA? #844

Closed

cuichenx opened this issue Apr 23, 2024 · 7 comments

@cuichenx

Hi, first of all, thanks for the great tutorials on LoRA and QLoRA! I was able to follow them very easily.
I was wondering whether multi-GPU QLoRA is supported. I couldn't find a config file for it in the repo, and when I tried using the multi-GPU LoRA recipe and adding `model.quantize_base=True`, I got this error:

```
ValueError: The module has CPU parameters or buffers when `sync_module_states=True`, which requires them to be on GPU. Please specify the `device_id` argument or move the module to GPU before passing it to FSDP.
```

Is multi-GPU QLoRA currently supported, or is it on the roadmap? Thanks a lot!
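For context, the error itself comes from FSDP rather than anything QLoRA-specific. Here is a minimal sketch of the constraint it enforces, assuming a `torchrun --nproc_per_node=2` launch and using a toy module standing in for the model (illustrative only, not the torchtune recipe):

```python
# Minimal sketch of the FSDP constraint behind the ValueError above.
# Assumes a launch like: torchrun --nproc_per_node=2 this_script.py
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = nn.Linear(16, 16)  # stands in for the (Q)LoRA model, still on CPU

# With sync_module_states=True, FSDP broadcasts rank 0's parameters to the
# other ranks on GPU. CPU parameters therefore need either a device_id or an
# explicit move to CUDA first; omitting both raises the quoted ValueError.
sharded = FSDP(
    model,
    device_id=torch.cuda.current_device(),
    sync_module_states=True,
)
```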

@joecummings
Contributor

Hey @cuichenx - glad you found the tutorials useful!

Currently, multi-GPU FSDP + QLoRA is not supported in torchtune, but this is something we are actively working on. Turns out it's a non-trivial combination. See this blog post from the folks over at answer.ai for some more information.
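Roughly, the difficulty is that a QLoRA base weight is no longer a plain floating-point Parameter that FSDP can flat-shard and all-gather; it becomes packed low-bit codes plus per-block scales. Here is a toy absmax 4-bit sketch of that representation (illustrative only, not torchtune's actual NF4 implementation):

```python
# Toy illustration of why FSDP + QLoRA is awkward to combine: the base
# weight turns into (codes, scales) rather than one floating-point tensor.
import torch

def quantize_4bit_absmax(w: torch.Tensor, block_size: int = 64):
    """Quantize to signed 4-bit codes with one floating-point scale per block."""
    blocks = w.reshape(-1, block_size)
    scales = (blocks.abs().amax(dim=1, keepdim=True) / 7.0).clamp_min(1e-8)
    codes = torch.clamp((blocks / scales).round(), -7, 7).to(torch.int8)
    return codes, scales

def dequantize(codes: torch.Tensor, scales: torch.Tensor, shape) -> torch.Tensor:
    return (codes.float() * scales).reshape(shape)

w = torch.randn(512, 64)
codes, scales = quantize_4bit_absmax(w)
# FSDP's FlatParameter machinery shards and all-gathers uniform floating-point
# parameters; teaching it to handle (codes, scales) pairs is the non-trivial
# part discussed in the answer.ai post.
print((w - dequantize(codes, scales, w.shape)).abs().max())
```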

cc: @rohan-varma

@cuichenx
Author

Thanks for the fast response! Looking forward to it :)

@kartikayk
Contributor

@cuichenx I'd be curious to learn more about your use case. Are you looking at QLoRA instead of LoRA because of memory constraints, or something else? My impression has been that LoRA gives a higher-quality model, though at slightly higher memory usage. Have you tried LoRA, and has it not worked on your setup? Thanks for taking a look at torchtune! :)

@cuichenx
Author

Hi @kartikayk, I'm currently doing some exploratory studies on QLoRA vs. LoRA, so I was looking for an apples-to-apples comparison; LoRA on a larger model like 34B or 70B would need multiple GPUs. For now I can do my studies on the smaller models.
Thanks for making this awesome framework!

@kartikayk
Contributor

@cuichenx sounds awesome! We'll make sure to comment on here as soon as we have this up and running!

@rohan-varma
Member

Thanks for trying out QLoRA @cuichenx and glad to hear that the tutorial is helpful!

Re: LoRA vs. QLoRA, as per the tutorial and the enablement PR (#478), in my experience we're actually able to get pretty good convergence with QLoRA and match LoRA on some eval tasks, with about 50% memory savings. As mentioned, though, we don't yet have multi-GPU support and are working on it.
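For anyone running the comparison on a single device in the meantime, the LoRA and QLoRA model builders are drop-in swaps. A hedged sketch using the Llama2 builders from the tutorial (treat exact argument names as assumptions for your torchtune version):

```python
# Sketch of swapping LoRA for QLoRA in torchtune; builder names are from
# the Llama2 tutorial, exact signatures may differ across versions.
from torchtune.models.llama2 import lora_llama2_7b, qlora_llama2_7b

# LoRA: adapters on the attention projections, base weights left unquantized.
lora_model = lora_llama2_7b(lora_attn_modules=["q_proj", "v_proj"])

# QLoRA: same adapters, but the frozen base weights are NF4-quantized,
# which is where the ~50% memory savings mentioned above comes from.
qlora_model = qlora_llama2_7b(lora_attn_modules=["q_proj", "v_proj"])
```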

@RdoubleA
Contributor

This was recently added in #909 and is currently available as an experimental feature in our latest stable version. Closing as completed for now; please reopen if you run into any issues using it.
