-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable CMPXCHG16B, SSE3, SAHF/LAHF and 128-bit Atomics (in nightly) in Windows x64 #120820
Conversation
…ers and Rust plans to set Windows 10 as the minimum supported OS for target x86_64-pc-windows-msvc, I have added the cmpxchg16b and sse3 feature (as CPUs that meet the Windows 10 64-bit requirement also support SSE3. See https://walbourn.github.io/directxmath-sse3-and-ssse3/ )
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @fmease (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
|
These commits modify compiler targets. |
This comment has been minimized.
This comment has been minimized.
Strange. |
This comment was marked as outdated.
This comment was marked as outdated.
@@ -3,6 +3,7 @@ use crate::spec::{base, SanitizerSet, Target}; | |||
pub fn target() -> Target { | |||
let mut base = base::windows_msvc::opts(); | |||
base.cpu = "x86-64".into(); | |||
base.features = "+cmpxchg16b,+sse3".into(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+sahf
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That I am not sure. x86-64-v2 onwards don't use sahf though it is listed in the features. CMPXCHG16B and SSE3 are listed. Does that mean Rustc doesn't yet make use of these instructions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you are seeing but x86-64-v2 totally does specify SAHF:
https:/llvm/llvm-project/blob/a9700904765590ca2fbf08c0cc36d0da1107d3a7/llvm/lib/Target/X86/X86.td#L808-L809
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used this
rustc --print=cfg -C target-cpu=x86-64-v2
debug_assertions
panic="unwind"
target_arch="x86_64"
target_endian="little"
target_env="msvc"
target_family="windows"
target_feature="cmpxchg16b"
target_feature="fxsr"
target_feature="popcnt"
target_feature="sse"
target_feature="sse2"
target_feature="sse3"
target_feature="sse4.1"
target_feature="sse4.2"
target_feature="ssse3"
target_has_atomic="16"
target_has_atomic="32"
target_has_atomic="64"
target_has_atomic="8"
target_has_atomic="ptr"
target_os="windows"
target_pointer_width="64"
target_vendor="pc"
windows
Did some more digging and turns out SAHF seems to be for kernel rather than userspace code. Hence it is not listed in the rustc command I ran above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also seems to be tied to virtualization and for kernel space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've been researching this a bit more. On balance I do think we should add +sahf
. While I don't think it's that important, it was seen fit to be included in x86-64-v2 feature level and I think we should align ourselves with that (although obviously dropping the SSE level back down to 3).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
Hm, using EDIT: Ah, |
Fixed a bug where adding CMPXCHG16B would fail due to different names in Rustc and LLVM
Thank you! Fixed it |
Also as CMPXCHG16B allows 128-bit/16-byte atomics, I am creating a new commit to update max atomic width as well. This is also already done for x86_64_apple_darwin target where it states: |
As CMPXCHG16B is supported, I updated the max atomic width to 128-bits from 64-bits
Hm, doesn't that also require avx for full atomics? |
x86-64 darwin (macOS) target does this without AVX support as the minimum requirement is Penryn (Core 2) which doesn't support AVX or AVX2. |
Fair enough. Maybe I was thinking of something else. |
Shouldn't this be set for all other x86 Windows targets as well? |
As for 32-bit Windows 10, it only requires SSE2 and that is already the requirement in Rustc. Now for 64-bit MinGW ABI, I could update it if that also requires Windows 10 and up. I looked at Rust platforms and I didn't see win7 version of MinGW ABI so I am unsure. However, if the -gnu targets also dropped support, I can do a new commit |
All |
Updated x86_64-uwp-windows-gnu to use CMPXCHG16B and SSE3
There are merge commits (commits with multiple parents) in your changes. We have a no merge policy so these commits will need to be removed for this pull request to be merged. You can start a rebase with the following commands:
The following commits are merge commits: |
There are merge commits (commits with multiple parents) in your changes. We have a no merge policy so these commits will need to be removed for this pull request to be merged. You can start a rebase with the following commands:
The following commits are merge commits (since this message was last posted): |
74f246c
to
1c6dda7
Compare
…enkov Enable CMPXCHG16B, SSE3, SAHF/LAHF and 128-bit Atomics (in nightly) in Windows x64 As Rust plans to set Windows 10 as the minimum supported OS for target x86_64-pc-windows-msvc, I have added the cmpxchg16b and sse3 feature. Windows 10 requires CMPXCHG16B, LAHF/SAHF, and PrefetchW as stated in the requirements [here](https://download.microsoft.com/download/c/1/5/c150e1ca-4a55-4a7e-94c5-bfc8c2e785c5/Windows%2010%20Minimum%20Hardware%20Requirements.pdf). Furthermore, CPUs that meet these requirements also have SSE3 ([see](https://walbourn.github.io/directxmath-sse3-and-ssse3/))
Enable CMPXCHG16B, SSE3, SAHF/LAHF and 128-bit Atomics (in nightly) in Windows x64 As Rust plans to set Windows 10 as the minimum supported OS for target x86_64-pc-windows-msvc, I have added the cmpxchg16b and sse3 feature. Windows 10 requires CMPXCHG16B, LAHF/SAHF, and PrefetchW as stated in the requirements [here](https://download.microsoft.com/download/c/1/5/c150e1ca-4a55-4a7e-94c5-bfc8c2e785c5/Windows%2010%20Minimum%20Hardware%20Requirements.pdf). Furthermore, CPUs that meet these requirements also have SSE3 ([see](https://walbourn.github.io/directxmath-sse3-and-ssse3/))
This comment has been minimized.
This comment has been minimized.
💔 Test failed - checks-actions |
Ok so the test is checking if SSE4.2 enables CRC32 on x86-64 (base). As we passed in extra features, it fails. I am going to look into how other targets handle this when they add feature flags |
I did some digging and it is not all good news. x86_64h-apple-darwin handles this by using core-avx2 and removing some features using the "-feature" syntax. Now, the features list is evaluated the feature left to right so "-feature,+feature" should still lead to it correcting when you add the feature. The disadvantage of this approach is that it doesn't work if you change the target cpu (like choosing native) where those features don't get added back. Other platforms like x86-64 Fuschia and Android add features just fine That leaves updating the test to exclude windows target, but I am unsure on if the test should be changed. At the very least, we should add another test for windows x64. |
The other idea is setting the target cpu as nocona but that might mess up optimizations as a whole because nocona is based off of pentium 4 (far off from modern CPUs as compared to core to haswell) |
From the test failure it looks like that line changes to attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable "target-cpu"="x86-64" "target-features"="+cx16,+sse3,+sahf,+sse4.2,+crc32" } So it's fine to just change the test expectation to allow additional target features before sse4.2/crc32, like
since that upholds the spirit of the test (that adding sse4.2 also adds crc32) |
Thanks erikdesjardins! Updating the test was a lot simpler than I expected. |
Just as a heads up the updated test just needs to be reviewed |
lgtm, thanks! @bors r=petrochenkov,ChrisDenton |
…llaumeGomez Rollup of 7 pull requests Successful merges: - rust-lang#119748 (Increase visibility of `join_path` and `split_paths`) - rust-lang#120820 (Enable CMPXCHG16B, SSE3, SAHF/LAHF and 128-bit Atomics (in nightly) in Windows x64) - rust-lang#121000 (pattern_analysis: rework how we hide empty private fields) - rust-lang#121376 (Skip unnecessary comparison with half-open range patterns) - rust-lang#121596 (Use volatile access instead of `#[used]` for `on_tls_callback`) - rust-lang#121669 (Count stashed errors again) - rust-lang#121783 (Emitter cleanups) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#120820 - CKingX:cpu-base-minimum, r=petrochenkov,ChrisDenton Enable CMPXCHG16B, SSE3, SAHF/LAHF and 128-bit Atomics (in nightly) in Windows x64 As Rust plans to set Windows 10 as the minimum supported OS for target x86_64-pc-windows-msvc, I have added the cmpxchg16b and sse3 feature. Windows 10 requires CMPXCHG16B, LAHF/SAHF, and PrefetchW as stated in the requirements [here](https://download.microsoft.com/download/c/1/5/c150e1ca-4a55-4a7e-94c5-bfc8c2e785c5/Windows%2010%20Minimum%20Hardware%20Requirements.pdf). Furthermore, CPUs that meet these requirements also have SSE3 ([see](https://walbourn.github.io/directxmath-sse3-and-ssse3/))
Thank you Chris, erikdesjardins, CryZe and petrochenkov! |
Pkgsrc changes: * Adapt checksums and patches, some have beene intregrated upstream. Upstream chnages: Version 1.78.0 (2024-05-02) =========================== Language -------- - [Stabilize `#[cfg(target_abi = ...)]`] (rust-lang/rust#119590) - [Stabilize the `#[diagnostic]` namespace and `#[diagnostic::on_unimplemented]` attribute] (rust-lang/rust#119888) - [Make async-fn-in-trait implementable with concrete signatures] (rust-lang/rust#120103) - [Make matching on NaN a hard error, and remove the rest of `illegal_floating_point_literal_pattern`] (rust-lang/rust#116284) - [static mut: allow mutable reference to arbitrary types, not just slices and arrays] (rust-lang/rust#117614) - [Extend `invalid_reference_casting` to include references casting to bigger memory layout] (rust-lang/rust#118983) - [Add `non_contiguous_range_endpoints` lint for singleton gaps after exclusive ranges] (rust-lang/rust#118879) - [Add `wasm_c_abi` lint for use of older wasm-bindgen versions] (rust-lang/rust#117918) This lint currently only works when using Cargo. - [Update `indirect_structural_match` and `pointer_structural_match` lints to match RFC] (rust-lang/rust#120423) - [Make non-`PartialEq`-typed consts as patterns a hard error] (rust-lang/rust#120805) - [Split `refining_impl_trait` lint into `_reachable`, `_internal` variants] (rust-lang/rust#121720) - [Remove unnecessary type inference when using associated types inside of higher ranked `where`-bounds] (rust-lang/rust#119849) - [Weaken eager detection of cyclic types during type inference] (rust-lang/rust#119989) - [`trait Trait: Auto {}`: allow upcasting from `dyn Trait` to `dyn Auto`] (rust-lang/rust#119338) Compiler -------- - [Made `INVALID_DOC_ATTRIBUTES` lint deny by default] (rust-lang/rust#111505) - [Increase accuracy of redundant `use` checking] (rust-lang/rust#117772) - [Suggest moving definition if non-found macro_rules! is defined later] (rust-lang/rust#121130) - [Lower transmutes from int to pointer type as gep on null] (rust-lang/rust#121282) Target changes: - [Windows tier 1 targets now require at least Windows 10] (rust-lang/rust#115141) - [Enable CMPXCHG16B, SSE3, SAHF/LAHF and 128-bit Atomics in tier 1 Windows] (rust-lang/rust#120820) - [Add `wasm32-wasip1` tier 2 (without host tools) target] (rust-lang/rust#120468) - [Add `wasm32-wasip2` tier 3 target] (rust-lang/rust#119616) - [Rename `wasm32-wasi-preview1-threads` to `wasm32-wasip1-threads`] (rust-lang/rust#122170) - [Add `arm64ec-pc-windows-msvc` tier 3 target] (rust-lang/rust#119199) - [Add `armv8r-none-eabihf` tier 3 target for the Cortex-R52] (rust-lang/rust#110482) - [Add `loongarch64-unknown-linux-musl` tier 3 target] (rust-lang/rust#121832) Refer to Rust's [platform support page][platform-support-doc] for more information on Rust's tiered platform support. Libraries --------- - [Bump Unicode to version 15.1.0, regenerate tables] (rust-lang/rust#120777) - [Make align_offset, align_to well-behaved in all cases] (rust-lang/rust#121201) - [PartialEq, PartialOrd: document expectations for transitive chains] (rust-lang/rust#115386) - [Optimize away poison guards when std is built with panic=abort] (rust-lang/rust#100603) - [Replace pthread `RwLock` with custom implementation] (rust-lang/rust#110211) - [Implement unwind safety for Condvar on all platforms] (rust-lang/rust#121768) - [Add ASCII fast-path for `char::is_grapheme_extended`] (rust-lang/rust#121138) Stabilized APIs --------------- - [`impl Read for &Stdin`] (https://doc.rust-lang.org/stable/std/io/struct.Stdin.html#impl-Read-for-%26Stdin) - [Accept non `'static` lifetimes for several `std::error::Error` related implementations] (rust-lang/rust#113833) - [Make `impl<Fd: AsFd>` impl take `?Sized`] (rust-lang/rust#114655) - [`impl From<TryReserveError> for io::Error`] (https://doc.rust-lang.org/stable/std/io/struct.Error.html#impl-From%3CTryReserveError%3E-for-Error) These APIs are now stable in const contexts: - [`Barrier::new()`] (https://doc.rust-lang.org/stable/std/sync/struct.Barrier.html#method.new) Cargo ----- - [Stabilize lockfile v4](rust-lang/cargo#12852) - [Respect `rust-version` when generating lockfile] (rust-lang/cargo#12861) - [Control `--charset` via auto-detecting config value] (rust-lang/cargo#13337) - [Support `target.<triple>.rustdocflags` officially] (rust-lang/cargo#13197) - [Stabilize global cache data tracking] (rust-lang/cargo#13492) Misc ---- - [rustdoc: add `--test-builder-wrapper` arg to support wrappers such as RUSTC_WRAPPER when building doctests] (rust-lang/rust#114651) Compatibility Notes ------------------- - [Many unsafe precondition checks now run for user code with debug assertions enabled] (rust-lang/rust#120594) This change helps users catch undefined behavior in their code, though the details of how much is checked are generally not stable. - [riscv only supports split_debuginfo=off for now] (rust-lang/rust#120518) - [Consistently check bounds on hidden types of `impl Trait`] (rust-lang/rust#121679) - [Change equality of higher ranked types to not rely on subtyping] (rust-lang/rust#118247) - [When called, additionally check bounds on normalized function return type] (rust-lang/rust#118882) - [Expand coverage for `arithmetic_overflow` lint] (rust-lang/rust#119432) Internal Changes ---------------- These changes do not affect any public interfaces of Rust, but they represent significant improvements to the performance or internals of rustc and related tools. - [Update to LLVM 18](rust-lang/rust#120055) - [Build `rustc` with 1CGU on `x86_64-pc-windows-msvc`] (rust-lang/rust#112267) - [Build `rustc` with 1CGU on `x86_64-apple-darwin`] (rust-lang/rust#112268) - [Introduce `run-make` V2 infrastructure, a `run_make_support` library and port over 2 tests as example] (rust-lang/rust#113026) - [Windows: Implement condvar, mutex and rwlock using futex] (rust-lang/rust#121956)
As Rust plans to set Windows 10 as the minimum supported OS for target x86_64-pc-windows-msvc, I have added the cmpxchg16b and sse3 feature. Windows 10 requires CMPXCHG16B, LAHF/SAHF, and PrefetchW as stated in the requirements here. Furthermore, CPUs that meet these requirements also have SSE3 (see)