
[FSDP2][1/n] construct NF4Tensor from bf16/fp16/fp32 #118

Merged
merged 22 commits into pytorch:main
Apr 19, 2024

Conversation

weifengpy
Contributor

@weifengpy weifengpy commented Apr 4, 2024

  • handle .to(bf16/fp16/fp32) via __torch_function__, as suggested by @cpuhrsch
  • NF4Tensor.dtype is the dtype used at construction; for example, to_nf4(fp32_tensor) returns an NF4Tensor with dtype fp32
  • covered dtype=fp16/bf16/fp32 in pytest test/dtypes/test_nf4.py
  • tested in torchtune with pytest tests -m integration_test (cc @rohan-varma)

It brings two benefits.
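
A minimal usage sketch of the construction/conversion path described above (the import path and exact call sites are assumptions based on this PR, not a verified torchao API reference):

import torch
from torchao.dtypes.nf4tensor import to_nf4  # import path assumed; may differ across versions

# Construct an NF4Tensor from a full-precision weight; per this PR the
# NF4Tensor records the dtype it was constructed from.
weight = torch.randn(512, 512, dtype=torch.float32)
nf4_weight = to_nf4(weight)
assert nf4_weight.dtype == torch.float32  # dtype reflects the construction dtype

# .to(dtype) is intercepted via __torch_function__ and returns the
# dequantized weight in the requested dtype.
restored = nf4_weight.to(torch.bfloat16)
assert restored.dtype == torch.bfloat16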

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 4, 2024
@weifengpy weifengpy marked this pull request as draft April 4, 2024 01:21
@weifengpy weifengpy changed the title proof of concept for FSDP2 + NF4Tensor [In Progress] FSDP2 + NF4Tensor Apr 4, 2024
cpuhrsch and others added 3 commits April 4, 2024 10:53

weifengpy and others added 8 commits April 16, 2024 14:13
@weifengpy weifengpy changed the title [In Progress] FSDP2 + NF4Tensor [FSDP2][1/n] construct NF4Tensor from bf16/fp16/fp32 Apr 17, 2024
@weifengpy
Contributor Author

weifengpy commented Apr 17, 2024

The failing test test/kernel/test_galore_downproj.py seems unrelated: it throws an error when parsing nvidia-smi. cc @msaroufim @jeromeku since it was newly added this week.

@weifengpy weifengpy marked this pull request as ready for review April 17, 2024 22:03
@cpuhrsch
Contributor

Great! Thank you! Can you try rebasing to see if CI runs green?

@weifengpy
Contributor Author

Great! Thank you! Can you try rebasing to see if CI runs green?

Rebasing now.


@implements_torch_function(torch.Tensor.to)
def function_to_dtype(*args, **kwargs):
    return args[0].get_original_weight().to(args[1])
Contributor


Nit: there are a few ways you can call .to that this isn't robust to, I would imagine.

Contributor Author

@weifengpy weifengpy Apr 17, 2024


Do you mean the 3 ways to call .to? I can raise an error for the unimplemented overloads if it helps:

  • Tensor.to(dtype, non_blocking=False, copy=False, memory_format=torch.preserve_format)
  • Tensor.to(device=None, dtype=None, non_blocking=False, copy=False, memory_format=torch.preserve_format)
  • Tensor.to(other, non_blocking=False, copy=False)

Contributor


Yeah, I think in theory they can all be supported; it's just that the other args/kwargs are getting dropped as currently implemented.

Contributor Author


Got it. I'll see if I can pass the args/kwargs through instead of dropping them.

Contributor Author


Updated the PR to pass args and kwargs through instead of dropping them, and to fall back to dispatch for overloads that are not implemented.
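
For reference, a simplified sketch of a handler that forwards args/kwargs for the dtype overload (a hedged illustration only, not the PR's exact code; implements_torch_function and get_original_weight are the NF4Tensor helpers seen in the snippet under review, and the import path is assumed):

import torch
from torchao.dtypes.nf4tensor import implements_torch_function  # import path assumed

@implements_torch_function(torch.Tensor.to)
def function_to_dtype(*args, **kwargs):
    nf4_tensor, rest = args[0], args[1:]
    if rest and isinstance(rest[0], torch.dtype):
        # to(dtype, non_blocking=..., copy=..., memory_format=...): dequantize,
        # then forward the remaining args/kwargs instead of dropping them.
        return nf4_tensor.get_original_weight().to(*rest, **kwargs)
    # to(device, ...) and to(other, ...) are not handled in this sketch; the PR
    # instead falls back to the dispatch path for overloads it does not implement.
    raise NotImplementedError("only the dtype overload of Tensor.to is handled in this sketch")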

Contributor

@drisspg drisspg left a comment


Overall looks good, thanks for pushing this through!


@weifengpy weifengpy marked this pull request as draft April 18, 2024 00:56
weifengpy and others added 4 commits April 18, 2024 16:12
def test_smoketest_linear_compile(self):
    for dtype in [torch.bfloat16, torch.float16]:
        if torch.cuda.is_available() and torch.cuda.get_device_capability() < (8, 0) and dtype == torch.bfloat16:
            self.skipTest("test requires SM capability of at least (8, 0).")
Contributor Author

@weifengpy weifengpy Apr 19, 2024


test_smoketest_linear_compile was effectively always skipped before: the loop exited via self.skipTest on torch.bfloat16 and never got a chance to test torch.float16. This version fixes it by using @parameterize, so each dtype runs as its own test case.
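
For illustration, a hedged sketch of the parametrized shape of the test (using the parameterized package here; the decorator, class name, and test body in the actual PR may differ):

import unittest
import torch
from parameterized import parameterized

class TestNF4Linear(unittest.TestCase):
    # Each dtype is its own test case, so skipping bf16 on pre-SM80 GPUs
    # no longer prevents the fp16 case from running.
    @parameterized.expand([(torch.bfloat16,), (torch.float16,)])
    def test_smoketest_linear_compile(self, dtype):
        if torch.cuda.is_available() and torch.cuda.get_device_capability() < (8, 0) and dtype == torch.bfloat16:
            self.skipTest("test requires SM capability of at least (8, 0).")
        # ... construct the NF4 linear module and run the torch.compile smoke test for this dtype ...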

if torch.cuda.is_available() and torch.cuda.get_device_capability() < (8, 0) and dtype == torch.bfloat16:
    self.skipTest("test requires SM capability of at least (8, 0).")
if version.parse(torch.__version__) < version.parse("2.3.0"):
    self.skipTest("test requires 2.3.0 and above for tracing NF4Tensor")
Contributor Author

@weifengpy weifengpy Apr 19, 2024


Starting from 2.3.0 we can trace a subclass whose inner tensors have different shapes than the outer wrapper class. Specifically, tracing uses symbolic_context.inner_contexts instead of the symbolic_context of the outer wrapper class: https://github.com/pytorch/pytorch/blob/main/torch/_subclasses/meta_utils.py#L649

@weifengpy weifengpy marked this pull request as ready for review April 19, 2024 01:53
@cpuhrsch cpuhrsch merged commit a7ff835 into pytorch:main Apr 19, 2024
13 checks passed
dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024