enable QLoRA + FSDP2 #909
Merged
Commits (100)
e5826a1 enable LoRA + FSDP2 (weifengpy)
64fc870 reset params for lora weights and rope (weifengpy)
0cd21c6 support lora weights checkpoint and checkpoint utils (weifengpy)
589191e fix lora meta device bug (weifengpy)
c801f26 save optim state dict (weifengpy)
19a2d70 mark TODO (weifengpy)
441da10 optimizer foreach=True for DTensor (weifengpy)
750b9e5 clip grad norm (weifengpy)
3d632d5 switch to ptd state dict api (weifengpy)
cb3abb3 add profiler (weifengpy)
dfcdde3 qlora 7b config (weifengpy)
e68804a use torchao copy_ (weifengpy)
b6fad93 Merge pull request #1 from weifengpy/fsdp2 (weifengpy)
d6af9a2 enable saving checkpoint (weifengpy)
7bbe522 Merge pull request #2 from weifengpy/fsdp2 (weifengpy)
b616394 optimizer state dict: load on rank0 and broadcast (weifengpy)
a400497 import Optimizer (weifengpy)
e9de63c resume training (weifengpy)
05d3895 prepare for full test (weifengpy)
7a5bb80 prepare for full test (weifengpy)
64bf49c remove profiler (weifengpy)
cb1bba4 passed integration test (weifengpy)
ac516e9 remove unnecessary change (weifengpy)
bfde704 Merge branch 'main' into fsdp2 (weifengpy)
102db31 bring back state dict validation (weifengpy)
0b66651 align indent on comment (weifengpy)
672aabb remove unused import (weifengpy)
6af2723 switch to ptd state dict and keep self implemented in record (weifengpy)
42ad99c clean unused code (weifengpy)
74f6175 remove cuda value error (weifengpy)
f1b8a5e comment on to_empty (weifengpy)
36e6829 fix memory issues by switching model state dict api (weifengpy)
08cd1fd clean for review (weifengpy)
559bc4d Merge branch 'main' into fsdp2 (weifengpy)
2333134 fix linter (weifengpy)
49a0364 fix checkpoint loading (weifengpy)
dc2ce02 expecttest CI dependency (weifengpy)
0a604aa CI dependency (weifengpy)
fa83140 fix CI issue (weifengpy)
6203a1f Merge branch 'main' into qlora (weifengpy)
4b5a895 Merge branch 'pytorch:main' into fsdp2 (weifengpy)
1080e2c Merge branch 'fsdp2' into qlora (weifengpy)
1a70498 rebase qlora (weifengpy)
cb862e9 rebase qlora (weifengpy)
21f5458 sync lora changes (weifengpy)
33773bd push qlora for perf measurement (weifengpy)
483028b fix meta init + cpu offloading (weifengpy)
cf42618 init RotaryPositionalEmbeddings in both fresh training and resume (weifengpy)
b519d50 import cpu offloading when needed (weifengpy)
8600ced FSDP(CheckpointWrapper(Model)) (weifengpy)
b2fd531 bring back cpu offloading (weifengpy)
bb8a8bc remove model.to (weifengpy)
db71c5c apply nf4 when loading model state dict (weifengpy)
16bf2de move lora to cpu when cpu offloading (weifengpy)
df6e535 Update documentation tab to point to main instead of stable (#960) (kartikayk)
5f621e1 Update tokens_per_sec to tokens_per_sec_per_gpu (#956) (kartikayk)
7d92b1c Delete init_weights_with_constant test util (#974) (ebsmothers)
588871e Sample packing for map datasets with correct RoPE encoding and no cro… (RdoubleA)
1a5bf1a Utilize compile on model rather than via torch API (#953) (joecummings)
ae7de20 Add better formatting for Eleuther eval results (#986) (joecummings)
23cea56 updating help docs for hf-token arg in download.py (#991) (SalmanMohammadi)
be06efa Fix position embeddings for Phi3 when packing + nits (#992) (RdoubleA)
79ef995 Llama3-8b memory efficient full finetune (#990) (rohan-varma)
5f55c16 Fix Gemma 2B model forward call (#998) (joecummings)
b88fa2d fix: lora dropout applied to all models (#995) (Optimox)
b47ee93 fix: different rope base between phi3 and lora_phi3 (#997) (Optimox)
d86b454 Add support for free generation tasks in evals (#975) (joecummings)
2b109f4 Filter out special tokens and placeholder tokens for Phi-3 (#983) (joecummings)
9bd07a6 TorchTune --> torchtune (#1007) (joecummings)
f5cb12e Support for unstructured text corpus datasets for CPT (#868) (RdoubleA)
a2066f9 Save adapter config and remapped adapter weights for loading into PEF… (ebsmothers)
29d1761 Datasets tutorial improvements (#994) (RdoubleA)
3a01d7f Fix TypeError: tuple indices must be integers or slices, not str issu… (tambulkar)
1d6b4a2 Add recipe test for llama3 (#929) (SLR722)
c74c9a9 Fix the Gemma generation (#1016) (solitude-alive)
00f96ff Update chat tutorial so that it works as is (#1004) (christobill)
62192df [fix] llama3 70B_lora update commented instructions (#1030) (pbontrager)
ecd5e7e Move nf4 op registration from utils to modules (#1035) (ebsmothers)
99c549b feat: add gemma7b support (#971) (Optimox)
7d11a89 Llama3-70b: Full Finetune w/CPU offload + fused optimizer (#993) (rohan-varma)
0080795 enable LoRA + FSDP2 (#855) (weifengpy)
00360f7 Merge branch 'weifengpy-qlora' into qlora (weifengpy)
d8664a3 Merge branch 'main' into qlora (weifengpy)
f58f9b2 rebase (weifengpy)
b9bfd41 revert lora_finetune_distributed.py (weifengpy)
7a3d9a1 rebase and register recipe (weifengpy)
2835d2a del logits to save memory (weifengpy)
559b81d fix linter (weifengpy)
85f978b gate NF4.copy_ on TorchAO==0.2.0 (weifengpy)
dbae23c improve torchao gating comment (weifengpy)
f4a8dfa upgrade torchao to 0.2 (weifengpy)
10e304d gate torchao 0.2 (weifengpy)
e117a21 replace with lora_finetune_fsdp2 (weifengpy)
4bb5e0f add llama2-70B (weifengpy)
174d916 replace with qlora and lora_finetune_fsdp2 in yaml (weifengpy)
5fdcefb rename yaml to _fsdp2.yaml (weifengpy)
b878018 add unit test for nf4 state dict (weifengpy)
a8f1a9a python 3.8 style dict union (weifengpy)
ae49684 validate lora sd missing (weifengpy)
cbb3da8 skip test if <2 gpu (weifengpy)
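Several commits above ("gate NF4.copy_ on TorchAO==0.2.0", "gate torchao 0.2") gate behavior on the installed torchao version. A minimal sketch of that kind of version gate; `supports_nf4_copy` is a hypothetical helper name, not torchtune's actual code:

```python
from importlib.metadata import PackageNotFoundError, version


def supports_nf4_copy(package: str = "torchao", minimum=(0, 2, 0)) -> bool:
    """Return True only if `package` is installed and at least `minimum`.

    Illustrates the idea of gating NF4 copy_ support on torchao >= 0.2.0;
    the helper name and signature are assumptions for this sketch.
    """
    try:
        # Parse the first three numeric components of the installed version.
        installed = tuple(int(part) for part in version(package).split(".")[:3])
    except (PackageNotFoundError, ValueError):
        # Package missing, or a non-numeric version string: assume unsupported.
        return False
    return installed >= minimum
```

Callers would then fall back to a manual parameter copy (or raise a clear error asking for an upgrade) whenever the gate returns False.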
Review comment: this will dequant NF4 back to the original weight; for QLoRA, we may not want that.
Reply: Hmm, I would say that we certainly do not want it.
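To make the reviewers' concern concrete: NF4 stores each weight as a 4-bit index into a 16-entry code book, so dequantizing expands every code back to a full float and gives up the memory savings on the frozen base weights. A toy round-trip sketch (uniform levels for simplicity; real NF4 in torchao uses normal-distribution quantiles and blockwise scales):

```python
def quantize_block(block, levels):
    """Map each float to the index of the nearest quantization level."""
    scale = max(abs(x) for x in block) or 1.0  # per-block absmax scale
    codes = []
    for x in block:
        normalized = x / scale
        codes.append(min(range(len(levels)), key=lambda i: abs(levels[i] - normalized)))
    return codes, scale


def dequantize_block(codes, scale, levels):
    """Expand 4-bit codes back to floats -- the 'dequant to original weight' step."""
    return [levels[c] * scale for c in codes]


# 16 levels ~ a 4-bit code book (uniform here; NF4 uses normal quantiles).
LEVELS = [i / 7.5 - 1.0 for i in range(16)]

weights = [0.25, -0.5, 0.9, -0.1]
codes, scale = quantize_block(weights, LEVELS)
restored = dequantize_block(codes, scale, LEVELS)
```

After `dequantize_block`, `restored` occupies full-precision storage again, which is exactly what the review thread wants to avoid for QLoRA's frozen base weights.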