Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AES-NI: use target attributes for x86 32-bit intrinsics #8406

Merged
merged 2 commits into from
Oct 30, 2023

Conversation

beni-sandu
Copy link
Contributor

@beni-sandu beni-sandu commented Oct 23, 2023

Description

Add target attributes for 32-bit builds to compile with intrinsics out of the box.

This fixes 32-bit compilation issues (e.g. #8334 and #8400).

For 64-bit, assembly will still be used if cpu flags are not passed when building.

PR checklist

Please tick as appropriate and edit the reasons (e.g.: "backport: not needed because this is a new feature")

  • changelog not required
  • backport not required, AES-NI is only enabled on 64bit on 2.28
  • tests provided

Tested that it builds fine on:
Windows 10: MINGW 32-bit/64-bit (13.0.2), Clang 32-bit/64-bit (17.0.1)
Linux Ubuntu 22.04: gcc 32-bit/64-bit (11.4.0), Clang 32-bit/64-bit (14.0.0)

Let me know if there is some better way of doing this.

@tom-daubney-arm tom-daubney-arm added bug component-crypto Crypto primitives and low-level interfaces needs-ci Needs to pass CI tests size-s Estimated task size: small (~2d) priority-medium Medium priority - this can be reviewed as time permits labels Oct 24, 2023
@tom-daubney-arm
Copy link
Contributor

Hi @beni-sandu ,

Thanks a lot for the contribution. I have kicked off the CI on your PR so we will get some feedback shortly.

@tom-daubney-arm
Copy link
Contributor

Note to maintainers: Internal CI failures are spurious. OpenCI has passed. Marking ready for review.

@tom-daubney-arm tom-daubney-arm added needs-review Every commit must be reviewed by at least two team members, needs-reviewer This PR needs someone to pick it up for review and removed needs-ci Needs to pass CI tests labels Oct 24, 2023
library/aesni.c Outdated Show resolved Hide resolved
library/aesni.h Outdated Show resolved Hide resolved
@daverodgman
Copy link
Contributor

daverodgman commented Oct 24, 2023

@beni-sandu
Copy link
Contributor Author

Thanks for the feedback @daverodgman, I have addressed your comments.

Seems there was a test for aesni 32-bit intrinsics in all.sh, I have added one for clang too, let me know if that is enough.

daverodgman
daverodgman previously approved these changes Oct 24, 2023
Copy link
Contributor

@daverodgman daverodgman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@tom-daubney-arm tom-daubney-arm removed the needs-reviewer This PR needs someone to pick it up for review label Oct 24, 2023
library/aesni.h Outdated Show resolved Hide resolved
library/aesni.c Outdated Show resolved Hide resolved
@beni-sandu
Copy link
Contributor Author

Thanks @tom-cosgrove-arm , fixed now.

Copy link
Contributor

@tom-daubney-arm tom-daubney-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

@tom-daubney-arm
Copy link
Contributor

tom-daubney-arm commented Oct 25, 2023

OpenCI is reporting failures, PTAL:

[2023-10-25T09:44:50.146Z] ******************************************************************
[2023-10-25T09:44:50.146Z] * test_aesni_m32: AES tests, test intrinsics (clang) 
[2023-10-25T09:44:50.146Z] * Wed Oct 25 09:44:49 UTC 2023
[2023-10-25T09:44:50.146Z] ******************************************************************
[2023-10-25T09:44:50.714Z]   CC    aes.c
[2023-10-25T09:44:50.714Z]   CC    aesni.c
[2023-10-25T09:44:50.973Z] aesni.c:52:15: error: unknown pragma ignored [-Werror,-Wunknown-pragmas]
[2023-10-25T09:44:50.973Z] #pragma clang attribute push (__attribute__((target("pclmul,sse2,aes"))), apply_to=function)
[2023-10-25T09:44:50.973Z]               ^
[2023-10-25T09:44:51.232Z] aesni.c:414:15: error: unknown pragma ignored [-Werror,-Wunknown-pragmas]
[2023-10-25T09:44:51.232Z] #pragma clang attribute pop
[2023-10-25T09:44:51.232Z]               ^
[2023-10-25T09:44:51.232Z] 2 errors generated.
[2023-10-25T09:44:51.232Z] Makefile:312: recipe for target 'aesni.o' failed
[2023-10-25T09:44:51.232Z] make[1]: *** [aesni.o] Error 1
[2023-10-25T09:44:51.232Z] Makefile:18: recipe for target 'lib' failed
[2023-10-25T09:44:51.232Z] make: *** [lib] Error 2
[2023-10-25T09:44:51.232Z] make: *** Waiting for unfinished jobs....
[2023-10-25T09:44:51.232Z]   CC    src/helpers.c
[2023-10-25T09:44:51.232Z]   CC    src/asn1_helpers.c
[2023-10-25T09:44:51.232Z]   CC    src/psa_crypto_helpers.c
[2023-10-25T09:44:51.232Z]   CC    src/psa_exercise_key.c
[2023-10-25T09:44:51.232Z]   CC    src/bignum_helpers.c
[2023-10-25T09:44:51.492Z]   CC    src/threading_helpers.c
[2023-10-25T09:44:51.492Z]   CC    src/random.c
[2023-10-25T09:44:51.492Z]   CC    src/fake_external_rng_for_test.c
[2023-10-25T09:44:51.492Z]   CC    src/certs.c
[2023-10-25T09:44:51.492Z]   CC    src/drivers/test_driver_aead.c
[2023-10-25T09:44:51.492Z]   CC    src/drivers/test_driver_asymmetric_encryption.c
[2023-10-25T09:44:51.750Z]   CC    src/drivers/test_driver_pake.c
[2023-10-25T09:44:51.750Z]   CC    src/drivers/test_driver_key_agreement.c
[2023-10-25T09:44:51.750Z]   CC    src/drivers/test_driver_signature.c
[2023-10-25T09:44:51.750Z]   CC    src/drivers/test_driver_key_management.c
[2023-10-25T09:44:51.750Z]   CC    src/drivers/test_driver_cipher.c
[2023-10-25T09:44:51.750Z]   CC    src/drivers/hash.c
[2023-10-25T09:44:51.750Z]   CC    src/drivers/test_driver_mac.c
[2023-10-25T09:44:52.009Z]   CC    src/drivers/platform_builtin_keys.c
[2023-10-25T09:44:52.009Z]   CC    src/test_helpers/ssl_helpers.c
[2023-10-25T09:44:52.009Z] ^^^^test_aesni_m32: AES tests, test intrinsics (clang): make CC=clang CFLAGS='-m32 -Werror -Wall -Wextra' LDFLAGS='-m32' -> 2^^^^

@tom-daubney-arm tom-daubney-arm added the needs-ci Needs to pass CI tests label Oct 25, 2023
@daverodgman
Copy link
Contributor

The error is because Ubuntu 16.04 has clang 4.0, which doesn't know about the target attribute. So we'll need a check in support_test_aesni_m32 in all.sh to ensure we have a clang later than 4.0, similar to support_build_sha_armce

@beni-sandu
Copy link
Contributor Author

I think it's easier to just check the clang version inside component_test_aesni_m32, otherwise skip the clang test?
This way we still run the rest of the aesni tests if we don't have the required version, and we also don't need to add new component_ and support_ functions just for the clang test.

Something like this inside component_test_aesni_m32:

    # If required version of clang is not available, skip this test
    if command -v clang > /dev/null ; then
        # clang >= 4 is required to build with target attributes
        clang_ver="$(clang --version|grep version|sed -E 's#.*version ([0-9]+).*#\1#')"
        if [[ "${clang_ver}" -ge 4 ]] ; then
            # test the intrinsics implementation with clang
            msg "AES tests, test intrinsics (clang)"
            make clean
            make CC=clang CFLAGS='-m32 -Werror -Wall -Wextra' LDFLAGS='-m32'
            # check that we built intrinsics - this should be used by default when supported by the compiler
            ./programs/test/selftest aes | grep "AESNI code" | grep -q "intrinsics"
            grep -q "AES note: using AESNI" ./programs/test/selftest
            grep -q "AES note: built-in implementation." ./programs/test/selftest
            grep -q "AES note: using VIA Padlock" ./programs/test/selftest
            grep -q mbedtls_aesni_has_support ./programs/test/selftest
        fi
    fi

I have used >= 4 for clang version, but I am actually not sure if it's okay or needs to be > 4, and cannot double check on my side at the moment.

@tom-cosgrove-arm
Copy link
Contributor

ping @tom-daubney-arm or @daverodgman for second review?

daverodgman
daverodgman previously approved these changes Oct 26, 2023
@daverodgman daverodgman added approved Design and code approved - may be waiting for CI or backports and removed needs-ci Needs to pass CI tests needs-review Every commit must be reviewed by at least two team members, labels Oct 26, 2023
@daverodgman daverodgman added this pull request to the merge queue Oct 26, 2023
@daverodgman
Copy link
Contributor

@tom-cosgrove-arm I think no changelog because the other PR relating to the common issue #8334 has a changelog?

@daverodgman daverodgman added needs-work and removed approved Design and code approved - may be waiting for CI or backports labels Oct 26, 2023
@daverodgman daverodgman removed this pull request from the merge queue due to a manual request Oct 26, 2023
@daverodgman
Copy link
Contributor

Conflicts with #8374

@beni-sandu
Copy link
Contributor Author

Interesting, #8374 mentions that clang does not define __GNUC__ on Windows, but it did with the versions that I tested, on both Linux/Windows.

I will have another look at this soon.

@tom-cosgrove-arm
Copy link
Contributor

Interesting, #8374 mentions that clang does not define __GNUC__ on Windows, but it did with the versions that I tested, on both Linux/Windows.

Sorry, probably not very clear.

When clang is used as part of Visual Studio, it is run as the "clang cl driver", clang-cl.exe. This is real clang, but like running clang.exe --driver-mode=cl, and it behaves like cl.exe, not clang.exe. This includes defining _MSC_VER and not defining __GNUC__, but it does define __clang__

@tom-cosgrove-arm tom-cosgrove-arm dismissed stale reviews from daverodgman and themself October 26, 2023 10:35

PR needs updating to avoid conflict with #8374

This way we build with 32-bit gcc/clang out of the box.
We also fallback to assembly for 64-bit clang-cl if needed cpu
flags are not provided, instead of throwing an error.

Signed-off-by: Beniamin Sandu <[email protected]>
@beni-sandu
Copy link
Contributor Author

I had a closer look at this, and based on my tests, seems there is some inconsistency between what #8374 is describing in the details and what is actually fixing, unless I am misunderstanding something, or there is a different behaviour in environment because of e.g. versions used.

I understood that one could not compile with clang-cl at all before that patch, with or without CPU flags, but on my side, it was building fine when flags were provided, and throwing an error about missing intrinsics functions when flags are not used. With that patch, it is now throwing an error about missing CPU flags, so it's better.

I anyways reworked my first patch to include those changes and the current behaviour of the development branch in my tests was as follows:

On Windows 10:

  • Visual Studio 2022, MSVC 19.37.32825:
  • 64-bit - Builds with intrinsics out of the box
  • 32-bit - Builds with intrinsics out of the box
  • Visual Studio 2022, clang-cl 16.0.5:
  • 64-bit - Fails with error on missing CPU flags out of the box, builds with intrinsics when flags are provided
  • 32-bit - Fails with error on missing CPU flags out of the box, builds with intrinsics when flags are provided
  • Msys2, standalone mingw 13.0.2:
  • 64-bit - Builds with assembly out of the box, builds with intrinsics when flags are provided
  • 32-bit - Fails with error on missing CPU flags out of the box, builds with intrinsics when flags are provided
  • Msys2, standalone clang 17.0.1:
  • 64-bit - Builds with assembly out of the box, builds with intrinsics when flags are provided
  • 32-bit - Fails with error on missing CPU flags out of the box, builds with intrinsics when flags are provided

On Linux Ubuntu 22.04:

  • gcc 11.4.0:
  • 64-bit - Builds with assembly out of the box, builds with intrinsics when flags are provided
  • 32-bit - Fails with error on missing CPU flags out of the box, builds with intrinsics when flags are provided
  • clang 14.0.0:
  • 64-bit - Builds with assembly out of the box, builds with intrinsics when flags are provided
  • 32-bit - Fails with error on missing CPU flags out of the box, builds with intrinsics when flags are provided

Changes added by the patch in this PR:

On Windows 10:

  • Visual Studio 2022, MSVC 19.37.32825:
  • 64-bit - Similar to before
  • 32-bit - Similar to before
  • Visual Studio 2022, clang-cl 16.0.5:
  • 64-bit - Now builds with assembly out of the box (instead of throwing error about missing flags), builds with intrinsics when flags are used.
  • 32-bit - Now builds with intrinsics out of the box (instead of throwing error about missing flags)
  • Msys2, standalone mingw 13.0.2:
  • 64-bit - Similar to before
  • 32-bit - Now builds with intrinsics out of the box (instead of throwing error about missing flags)
  • Msys2, standalone clang 17.0.1:
  • 64-bit - Similar to before
  • 32-bit - Now builds with intrinsics out of the box (instead of throwing error about missing flags)

On Linux Ubuntu 22.04:

  • gcc 11.4.0:
  • 64-bit - Similar to before
  • 32-bit - Now builds with intrinsics out of the box (instead of throwing error about missing flags)
  • clang 14.0.0:
  • 64-bit - Similar to before
  • 32-bit - Now builds with intrinsics out of the box (instead of throwing error about missing flags)

Please let me know what you think.

Copy link
Contributor

@daverodgman daverodgman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough testing. LGTM

Copy link
Contributor

@tom-cosgrove-arm tom-cosgrove-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tom-cosgrove-arm tom-cosgrove-arm added approved Design and code approved - may be waiting for CI or backports and removed needs-work labels Oct 30, 2023
@daverodgman daverodgman added this pull request to the merge queue Oct 30, 2023
Merged via the queue into Mbed-TLS:development with commit b06d701 Oct 30, 2023
3 checks passed
codypiersall pushed a commit to codypiersall/pynng that referenced this pull request Jan 15, 2024
mbedtls fails to build on 32-bit Linux without passing some compiler
options. We pass them through an environment variable, which is not the
best way to do it, but here we are.

Hopefully this will be able to be removed whenever mbedtls pushes a new
release; seems that this has been fixed upstream in merge
Mbed-TLS/mbedtls#8406.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Design and code approved - may be waiting for CI or backports bug component-crypto Crypto primitives and low-level interfaces priority-medium Medium priority - this can be reviewed as time permits size-s Estimated task size: small (~2d)
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants