Finish Avx512 specific lightup for Vector128/256/512<T> #85207
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
@tannergooding Sum is a weird one. I don't believe any single AVX-512 instruction exists for it. Documentation: Clang implementation:
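To illustrate why a horizontal sum does not map onto a single instruction, here is a minimal scalar sketch of the usual halving reduction, in which each pass stands in for one shuffle or extract followed by a vertical add. The function name `horizontal_sum_u32` and the fixed 16-lane layout are assumptions made for illustration. This is not the Clang or .NET implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* a 512-bit vector holds sixteen 32-bit lanes */

/* Hypothetical helper: sums 16 lanes by repeated halving.
   Each pass adds the upper half of the active lanes onto the lower half,
   which models one shuffle/extract plus one vertical add.
   Unsigned arithmetic is used so that overflow wraps predictably. */
uint32_t horizontal_sum_u32(const uint32_t lanes[LANES])
{
    uint32_t work[LANES];
    for (size_t i = 0; i < LANES; i++) {
        work[i] = lanes[i];
    }

    /* width takes the values 8, 4, 2, 1: four passes for 16 lanes */
    for (size_t width = LANES / 2; width >= 1; width /= 2) {
        for (size_t i = 0; i < width; i++) {
            work[i] += work[i + width];
        }
    }
    return work[0];
}
```

For 16 lanes this takes four add passes (log2 of the lane count), so a vectorized version would presumably need a short instruction sequence rather than one opcode.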
Most of this is being handled in #100993.
With #80814, we achieved functional parity of Vector512<T> with Vector128<T> and Vector256<T>. However, there are some new instructions available on Avx512-capable hardware that will allow additional hardware acceleration opportunities for all three types.

This includes:
- vcvtqq2pd & vcvtuqq2pd
- vcvtpd2qq
- vcvtps2udq
- vcvtpd2uqq
- vpternlog
- vpermi2*, vpermt2*, etc.

We should also ensure that all APIs are accelerated as intrinsics, where applicable. In particular, the following are still managed fallbacks (but accelerated):
There may be others as well, so a general audit to validate would be good.
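As an illustration of the vpternlog item above, the following is a minimal scalar sketch of how its 8-bit immediate acts as a truth table over three inputs: for each bit position, the bits of a, b, and c form a 3-bit index that selects one bit of the immediate. The function name `ternary_logic_u32` is hypothetical, and the code models the instruction's semantics on one 32-bit lane. It is not the JIT's implementation.

```c
#include <stdint.h>

/* Hypothetical scalar model of a three-input bitwise function chosen by an
   8-bit truth table. For every bit position, the index is
   (bit of a << 2) | (bit of b << 1) | (bit of c), and the result bit is
   bit `index` of imm8. */
uint32_t ternary_logic_u32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm8)
{
    uint32_t result = 0;
    for (unsigned bit = 0; bit < 32; bit++) {
        unsigned index = (((a >> bit) & 1u) << 2)
                       | (((b >> bit) & 1u) << 1)
                       |  ((c >> bit) & 1u);
        result |= (uint32_t)((imm8 >> index) & 1u) << bit;
    }
    return result;
}
```

Under this model, an immediate of 0x96 yields a ^ b ^ c and 0xCA yields the bitwise select (a & b) | (~a & c), which is why a chain of two dependent bitwise operations can often be folded into a single ternary-logic operation.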