
【Hackathon No57】add_bf16_fp16 unittest for conv3d & conv3d_transpose #52195

Merged
merged 8 commits into from
Apr 25, 2023

Conversation

Difers
Contributor

@Difers Difers commented Mar 27, 2023

PR types

Others

PR changes

APIs

Describe

Add BF16 unit tests for conv3d, and add backward checks to its FP16 unit tests.
Add FP16 and BF16 unit tests for conv3d_transpose.

@paddle-bot

paddle-bot bot commented Mar 27, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot paddle-bot bot added contributor External developers status: proposed labels Mar 27, 2023
# TODO(wangzhongpu): support mkldnn op in dygraph mode
place = core.CUDAPlace(0) if self.has_cudnn() else core.CPUPlace()
self.check_output_with_place(
    place, atol=1e-2, check_dygraph=(not self.use_mkldnn)
Contributor

Under BF16 the default atol is already 1e-2, so this argument can be removed.

check_dygraph=(not self.use_mkldnn),
)

cls_name = "{0}_{1}".format(parent.__name__, "CUDNNBF16")
Contributor

Change "CUDNNBF16" to "CUDNNBF16OP".

place, ['Filter'], 'Output', no_grad_set=set(['Input'])
)

cls_name = "{0}_{1}".format(parent.__name__, "CUDNNFp16")
Contributor

Change "CUDNNFp16" to "CUDNNFP16OP".

if core.is_compiled_with_cuda():
    place = core.CUDAPlace(0)
    if core.is_float16_supported(place):
        self.check_output_with_place(place, atol=1e-5)
Contributor

@Vvsmile Vvsmile Mar 27, 2023

An atol of 1e-5 is too strict here; please use the default 1e-3 (the internal FP16 default is 1e-3, so the atol argument can simply be omitted).


def test_check_output(self):
    place = core.CUDAPlace(0)
    self.check_output_with_place(place, atol=1e-5)
Contributor

An atol of 1e-5 is too strict for bf16; please use 1e-2 (the internal BF16 default is 1e-2, so the atol argument can be omitted).
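The defaults the reviewer cites follow from the formats themselves: bfloat16 keeps only 7 explicit mantissa bits, float16 keeps 10. A small standalone numpy sketch (not PR code) shows why BF16 needs the looser 1e-2 tolerance while 1e-3 suffices for FP16:

```python
import numpy as np

def truncate_to_bf16(x):
    # keep only the upper 16 bits of the float32 pattern (bfloat16 rounding
    # by truncation), dropping the 16 low mantissa bits
    bits = np.array([x], dtype=np.float32).view(np.uint32)
    return float((bits & np.uint32(0xFFFF0000)).view(np.float32)[0])

# a float32 value whose 16 low mantissa bits are all ones: worst case for bf16
x = float(np.array([0x3F80FFFF], dtype=np.uint32).view(np.float32)[0])

bf16_err = abs(truncate_to_bf16(x) - x)   # ~7.8e-3: inside 1e-2, outside 1e-3
fp16_err = abs(float(np.float16(x)) - x)  # comfortably inside 1e-3
print(bf16_err, fp16_err)
```

So a test that sets atol=1e-3 (let alone 1e-5) for a BF16 kernel can fail on pure representation error, with no bug in the op at all.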

user_defined_grads=[numeric_grads],
)

cls_name = "{0}_{1}".format(parent.__name__, "CUDNNBF16")
Contributor

Change "CUDNNBF16" to "CUDNNBF16OP".

Contributor Author

done

@Difers Difers force-pushed the add_bf_fp_conv branch 4 times, most recently from daa919c to 06bc17d Compare March 30, 2023 06:51
@Difers Difers force-pushed the add_bf_fp_conv branch 2 times, most recently from 71bfe22 to d9b4be3 Compare March 31, 2023 02:15
@Difers Difers changed the title 【Hackathon No57】add_conv3d_conv3dtp_bf_fp16 【Hackathon No57】add_bf16_fp16 unittest for conv3d & conv3d_transpose Mar 31, 2023
'Filter': OpTest.np_dtype_to_fluid_dtype(filter),
}
if self.is_bfloat16_op():
    output = output.astype(np.float32)
Contributor

output should also be converted with convert_float_to_uint16.

['Input'],
'Output',
no_grad_set={'Filter'},
user_defined_grads=[numeric_grads],
Contributor

Please first try check_grad_with_place here without the user_defined_grads argument.

['Filter'],
'Output',
no_grad_set={'Input'},
user_defined_grads=[numeric_grads],
Contributor

Please first try check_grad_with_place here without the user_defined_grads argument.

).astype("float32")

if self.is_bfloat16_op():
    output = output.astype(np.float32)
Contributor

Under bf16, output needs to be converted with convert_float_to_uint16.

Contributor Author

Using convert_float_to_uint16 here causes a problem: for NHWC-format data it applies a transpose, so at check_output time the dimensions of expect_np and actual_np no longer line up and the check fails.

@Difers
Contributor Author

Difers commented Apr 5, 2023

Done, please take another look~

@Difers
Contributor Author

Difers commented Apr 11, 2023

@Vvsmile I found that convert_float_to_uint16 only handles 4-D transposes, while test_conv3d_transpose_op.py needs 5-D, so I added a version of my own. Please review when you have time~

def test_check_grad(self):
    if hasattr(self, "no_need_check_grad") and self.no_need_check_grad:
        return
    place = core.CUDAPlace(0) if self.has_cudnn() else core.CPUPlace()
Contributor

@ZzSean ZzSean Apr 13, 2023

This should only test CUDAPlace; it doesn't seem to be supported when has_cudnn is False.

'Filter': OpTest.np_dtype_to_fluid_dtype(filter),
}
if self.is_bfloat16_op():
    output = convert_float_to_uint16(output)
Contributor

Similar to the earlier issue: the order is wrong for uint16. The data should be initialized as float32 first, and the inputs and outputs converted only after the result has been computed.

@unittest.skipIf(
not core.is_compiled_with_cuda(), "core is not compiled with CUDA"
)
class TestConv3DCUDNNFP16(parent):
Contributor

The class name needs to be updated to include Transpose.


from paddle.fluid import core


def convert_float_to_uint16(float_list, data_format="NCHW"):
Contributor

Why reimplement this here?

Contributor Author

In the original helper, the transpose for the NCHW data_format is np.transpose(float_list, [0, 3, 1, 2]), but the data_format here is NCDHW, so the transpose needs to be the 5-D [0, 4, 1, 2, 3].
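A 5-D-aware variant along the lines the author describes might look like the sketch below. The data_format branch name ("NDHWC"), the truncation-style bit conversion, and the exact signature are assumptions for illustration, not the PR's actual helper:

```python
import numpy as np

def convert_float_to_uint16(float_array, data_format="NCDHW"):
    # For a channels-last 5-D layout, move channels next to the batch axis
    # first, mirroring the 4-D helper's [0, 3, 1, 2] permutation.
    float_array = np.asarray(float_array, dtype=np.float32)
    if data_format == "NDHWC":
        float_array = np.transpose(float_array, [0, 4, 1, 2, 3])
    # Keep the upper 16 bits of each float32: the bfloat16 bit pattern.
    uint16_array = np.right_shift(
        np.ascontiguousarray(float_array).view(np.uint32), 16
    ).astype(np.uint16)
    if data_format == "NDHWC":
        # undo the permutation so the output layout matches the input
        uint16_array = np.transpose(uint16_array, [0, 2, 3, 4, 1])
    return uint16_array
```

Note that [0, 2, 3, 4, 1] is the inverse of [0, 4, 1, 2, 3], so the returned array has exactly the caller's original axis order, which avoids the expect_np/actual_np shape mismatch discussed above.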

Contributor

@ZzSean ZzSean left a comment

LGTM

@luotao1 luotao1 merged commit eb67710 into PaddlePaddle:develop Apr 25, 2023
ZzSean pushed a commit to ZzSean/Paddle that referenced this pull request May 5, 2023
…addlePaddle#52195)

* add test+conv3d_transpose_part2

* fix some merge error

* fix codestyle

* fix typo

* fix codestyle

* fix some error

* add redef float2uint

* fix conv3d and conv3d_transpose