Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc: Add support for some more x86 SIMD ops #45367

Merged
merged 1 commit into from
Nov 26, 2017

Conversation

alexcrichton
Copy link
Member

This commit adds compiler support for two basic operations needed for binding
SIMD on x86 platforms:

  • First, a nontemporal_store intrinsic was added for the _mm_stream_ps, seen
    in Rust support for nontemporal stores? stdarch#114. This was relatively straightforward and is
    quite similar to the volatile store intrinsic.

  • Next, and much more intrusively, a new type to the backend was added. The
    x86_mmx type is used in LLVM for a 64-bit vector register and is used in
    various intrinsics like _mm_abs_pi8 as seen in ssse3 _mm_abs_pi8 : Intrinsic has incorrect return type! stdarch#74.
    This new type was added as a new layout option as well as having support added
    to the trans backend. The type is enabled with the #[repr(x86_mmx)]
    attribute which is intended to just be an implementation detail of SIMD in
    Rust.

I'm not 100% certain about how the x86_mmx type was added, so any extra eyes
or thoughts on that would be greatly appreciated!

@rust-highfive
Copy link
Collaborator

r? @arielb1

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Member Author

cc @eddyb

@kennytm kennytm added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 18, 2017
@@ -1022,6 +1022,9 @@ pub enum Layout {
count: u64
},

/// The `x86_mmx` type, structs marked with `#[repr(x86_mmx)]`
X86Mmx,
Copy link
Member

@eddyb eddyb Oct 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intuition would be to use Vector and special-case it in rustc_trans for LLVM.
#45225 changes things a bit, and I'm not sure I'm uncomfortable merging something before then, as there would be roughly a dozen individual rebase conflicts, but that's the worst part, figuring out a coherent design is.

All the FFI ABI decisions in #45225 are done without involving the Rust type, intentionally, so probably the best compromise right now would be to have an x86_mmx bool inside Vector.

EDIT: Actually, is it always an u64 and not semantically a vector in any way (aside from what LLVM does with its own type)? Because then we have another option, putting it in Scalar, which is a struct in #45225, and would require less work.

To make a relatively optimal decision here I'd suggest completely ignoring LLVM as a target and focusing on the closest match in terms of memory access and call ABIs.
Also, do we need this in anything other than intrinsics? Because for platform intrinsics we already transform the arguments and return, so there's no real need for matching Rust types (yes not even for #[repr(simd)], although we do use it right now).

@@ -139,7 +139,8 @@ impl ArgAttributes {
pub enum RegKind {
Integer,
Float,
Vector
Vector,
X86Mmx,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here (x86_mmx: bool inside Vector), with the caveat that the system here is subject to change/removal in the future.

@alexcrichton
Copy link
Member Author

Good points @eddyb! I've updated with a slightly-hacky solution, but I think should probably be adequate for now.

@kennytm
Copy link
Member

kennytm commented Oct 19, 2017

The error index test failed without any error message. This seems to be related to the test case around E0509 to E0517 as the test stopped there.

Testing error-index stage2
[01:32:37] doc tests for: /checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md
[01:33:11] 
[01:33:11] 
[01:33:11] command did not execute successfully: "/checkout/obj/build/bootstrap/debug/rustdoc" "--test" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md" "--test-args" ""
[01:33:11] expected success, got: exit code: 1
[01:33:11] 
[01:33:11] stdout ----
[01:33:11] 
[01:33:11] running 677 tests
[01:33:11] test /checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md - Rust_Compiler_Error_Index::E0003::_::Note__this_error_code_is_no_longer_emitted_by_the_compiler_ (line 71) ... ok
[01:33:11] test /checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md - Rust_Compiler_Error_Index::E0002::_::Note__this_error_code_is_no_longer_emitted_by_the_compiler_ (line 55) ... ok
...
[01:33:11] test /checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md - Rust_Compiler_Error_Index::E0511 (line 9584) ... ignored
[01:33:11] test /checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md - Rust_Compiler_Error_Index::E0508 (line 9469) ... ok
[01:33:11] test /checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md - Rust_Compiler_Error_Index::E0512 (line 9621) ... ok
[01:33:11] test /checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md - Rust_Compiler_Error_Index::E0509 (line 9525) ... ok
[01:33:11] test /checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md - Rust_Compiler_Error_Index::E0509 (line 9551) ... ok
[01:33:11] test /checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md - Rust_Compiler_Error_Index::E0516 (line 9649) ... ok
[01:33:11] test /checkout/obj/build/x86_64-unknown-linux-gnu/test/error-index.md - Rust_Compiler_Error_Index::E0517 (line 9672) ... ok
[01:33:11] 
[01:33:11] stderr ----
[01:33:11] 
[01:33:11] 
[01:33:11] 
[01:33:11] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:33:11] Build completed unsuccessfully in 0:40:44
[01:33:11] make: *** [check] Error 1
[01:33:11] Makefile:52: recipe for target 'check' failed

I suspect E0511 since only it touches SIMD.

// should be run-pass
#![feature(repr_simd)]
#![feature(platform_intrinsics)]

#[repr(simd)]
#[derive(Copy, Clone)]
struct i32x1(i32);

extern "platform-intrinsic" {
    fn simd_add<T>(a: T, b: T) -> T;
}

unsafe { simd_add(i32x1(0), i32x1(1)); } // ok!

@kennytm kennytm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 19, 2017
// much else.
Layout::Vector { count, .. } => {
let size = self.size(ccx);
let x86_mmx = count == 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also check size.bits() == 64 and maybe the architecture? Otherwise this is a neat hack!

@@ -178,7 +178,13 @@ pub fn in_memory_type_of<'a, 'tcx>(cx: &CrateContext<'a, 'tcx>, t: Ty<'tcx>) ->
}
let llet = in_memory_type_of(cx, e);
let n = t.simd_size(cx.tcx()) as u64;
Type::vector(&llet, n)
// see comment in abi.rs where we set `x86_mmx` to true for why we
// compare to 1 here and return a different type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I'd expect this to have the "main copy" of the comment and abi.rs to point to it. Not a big deal though. The same thing about checking the size applies here too I guess.

@alkis
Copy link
Contributor

alkis commented Oct 19, 2017

I think we should not care about MMX. One can do anything that can be done with MMX with SSE2. All x86 chips produced 2000 onwards support SSE2. Moreover MMX is full of traps (reusing floating point registers, etc).

Moreover I think the absolute minimum should be SSSE3 (NetBurst) which is 17 years old now. And this is if we want to run on dinosaurs.

My real recommendation is to forget anything built past the last 7 years and require that x86 supports at minimum SSE42+AES-NI. This means Westmere onwards which is 7 years old now. SSE42 is already decent SIMD and AES-NI is extremely nice to have by default. Surely it won't work for everyone; that said 7 years is an eternity on hardware. So we should be weighing the cost benefit ratio of these choices. How many usecases we target with supporting anything before Westmere? Also consider this number is diminishing year over year.

TLDR: do not add support for MMX, require at least SSSE3 or preferably SSE42+AES-NI.

@alexcrichton
Copy link
Member Author

@alkis thanks for the comment! To be clear though this is just providing support for the stdsimd crate which is exposing the same interface that C/C++ have (e.g. the official intrinsics by Intel). We're not going to start using this in the compiler, for example, any time soon. In that sense sounds this sounds like some great documentation to discourage their use!

@alexcrichton
Copy link
Member Author

@eddyb updated!

@kennytm kennytm added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 19, 2017
@alkis
Copy link
Contributor

alkis commented Oct 19, 2017

Why do we want to expose mmx APIs through stdsimd? (I assume stdsimd is https:/rust-lang-nursery/simd, correct?)

@alexcrichton
Copy link
Member Author

@alkis the job of stdsimd (this crate) it to provide all platform intrinsics, and the platform intrinsics apparently cover mmx pieces. It's not really up to us to decide what or what not to expose, it's up to authors what to use.

@alkis
Copy link
Contributor

alkis commented Oct 20, 2017

It's not really up to us to decide what or what not to expose, it's up to authors what to use.

This statement resonates with me - we don't want to cherry-pick functionality unless there is a very good reason for it. This is the same approach we took when designing a SIMD API for C++. We excluded MMX and everything that would be slow to implement on top of intrinsics. MMX is dead weight and adding support for it is a disservice to your users:

  1. You are adding a 64bit vector type which is unnecessary complexity. stdsimd (and by extent) the standard library gets bloated for (I assume) the sake of being "complete" because Intel happens to still provide these obsolete intrinsics in their libraries.
  2. Providing MMX gives the false notion to the inexperienced SIMD user that MMX is a valid choice for SIMD when it is a really bad choice. Authors might not read Intel’s manuals in depth and the intrinsics guide does not mention MMX downsides (aliasing with x87 registers).

The same care applied to the standard library for minimal, interoperable and ergonomic APIs of which Rust does a good job at, should apply here as well. For MMX the costs of adding are high because they introduce more types and surprising APIs (not to mention there is extra work done to add support for it) and the benefits of adding it are arguably negative.

On the other end of the spectrum if you decide to eschew support for older architectures you can provide added value to users. Saying rust has AES on every platform you support is a pretty nice guarantee and will have tangible benefits on random number generation, security, etc. Doing AES in software is risky and troublesome, you need to be careful to avoid secret-dependent branches and loads. Relying on its presence in hardware can be much safer. For me the cost benefit ratio of that decision is as clear as that of dropping MMX support.

As mentioned, I was recently involved in the design of a SIMD API for C++ as a secondary author/reviewer. The primary author is quite the expert in SIMD development. The library is open source but the open source version is only periodically pushed to github. As I understand the spirit of stdsimd, it has very similar goals as this library. Perhaps you can take a look at the design and reference - maybe you can steal some of the good ideas.

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Oct 20, 2017

I don't really have a dog in this fight but a couple quick notes:

  • The MMX registers (and thus, the type being added here) are not only used for MMX-level instructions, but also for some operations in later extensions. Specifically, ssse3 _mm_abs_pi8 : Intrinsic has incorrect return type! stdarch#74 encountered the lack of such a type while trying to bind an SSE3 operation. I don't know if that operation is one that should be used, but at least it can't be dismissed as easily based on the age of the instruction set.
  • stdsimd does not attempt to provide higher-level abstractions like the library you are involved in, it provides all the vendor intrinsics. And as such, it already has a massive surface area – I haven't checked but I expect that the number of MMX intrinsics is a drop in the bucket, relatively speaking.
  • Consider that Rust currently supports (at tier 2, but still) Windows XP and x86 processors without any SSE extensions. While this does not mean everything "in between" needs to be supported well (and it's certainly possible to drop support for those platforms in the future), I think it does illustrate that the threshold for "too old to bother supporting" is rather high.

@oyvindln
Copy link
Contributor

oyvindln commented Oct 21, 2017

My real recommendation is to forget anything built past the last 7 years and require that x86 supports at minimum SSE42+AES-NI.

Requiring this seems way too early still, e.g looking at the steam hw survey (Look under "other settings"), if this survey is somewhat accurate, about 91% of steam users have a SSE42-capable processor. I.e even amongst people that play games and would probably have way more up to date computers than the average user, 8-9% don't have a processor supporting these instructions, which seems way to high to consider dropping support for.

Granted, this is probably the wrong thread to be discussing this.

@jan-wassenberg
Copy link

Specifically, rust-lang/stdarch#74 encountered the lack of such a type while trying to bind an SSE3 operation. I don't know if that operation is one that should be used

abs_pi8 is abs() backported to MMX for whatever reason. There is no need to use it, abs_epi8 does twice as much work at the same (compatibility/runtime) cost.

And as such, it already has a massive surface area – I haven't checked but I expect that the number of MMX intrinsics is a drop in the bucket, relatively speaking.

True, but in absolute terms ~140 intrinsics add unnecessary clutter. Also, the "intrusive" x86_mmx backend changes could be avoided.

I agree it makes sense to drop MMX, including any subsequent instructions that still use its registers.

@alexcrichton
Copy link
Member Author

Thanks again for the input everyone! I personally still feel that we should stay the current course of "bind all the intrinsics", but keep in mind that everything here is unstable and will go through more discussion before stabilization. In that sense I think this is certainly all highly relevant for stabilization!

@alexcrichton
Copy link
Member Author

Also ping r? @arielb1

@bors
Copy link
Contributor

bors commented Nov 21, 2017

⌛ Testing commit 74dd1c2 with merge db393cf3f409943fada59a18409ebf8c46001118...

@bors
Copy link
Contributor

bors commented Nov 21, 2017

💔 Test failed - status-travis

@kennytm
Copy link
Member

kennytm commented Nov 22, 2017

@bors retry #42117

@bors
Copy link
Contributor

bors commented Nov 22, 2017

⌛ Testing commit 74dd1c2 with merge 7eb90e4...

bors added a commit that referenced this pull request Nov 22, 2017
rustc: Add support for some more x86 SIMD ops

This commit adds compiler support for two basic operations needed for binding
SIMD on x86 platforms:

* First, a `nontemporal_store` intrinsic was added for the `_mm_stream_ps`, seen
  in rust-lang/stdarch#114. This was relatively straightforward and is
  quite similar to the volatile store intrinsic.

* Next, and much more intrusively, a new type to the backend was added. The
  `x86_mmx` type is used in LLVM for a 64-bit vector register and is used in
  various intrinsics like `_mm_abs_pi8` as seen in rust-lang/stdarch#74.
  This new type was added as a new layout option as well as having support added
  to the trans backend. The type is enabled with the `#[repr(x86_mmx)]`
  attribute which is intended to just be an implementation detail of SIMD in
  Rust.

I'm not 100% certain about how the `x86_mmx` type was added, so any extra eyes
or thoughts on that would be greatly appreciated!
@bors
Copy link
Contributor

bors commented Nov 22, 2017

💔 Test failed - status-travis

@kennytm
Copy link
Member

kennytm commented Nov 22, 2017

armhf-gnu failed... because the new tests are not supposed to be run on ARM 😆

[01:01:21] failures:
[01:01:21] 
[01:01:21] ---- [codegen] codegen/x86_mmx.rs stdout ----
[01:01:21] 	
[01:01:21] error: verification with 'FileCheck' failed
[01:01:21] status: exit code: 1
[01:01:21] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/llvm/build/bin/FileCheck" "--input-file" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/x86_mmx.ll" "/checkout/src/test/codegen/x86_mmx.rs"
[01:01:21] stdout:
[01:01:21] ------------------------------------------
[01:01:21] 
[01:01:21] ------------------------------------------
[01:01:21] stderr:
[01:01:21] ------------------------------------------
[01:01:21] /checkout/src/test/codegen/x86_mmx.rs:22:18: error: expected string not found in input
[01:01:21]  // CHECK-LABEL: define x86_mmx @a(x86_mmx*{{.*}}, x86_mmx{{.*}})
[01:01:21]                  ^
[01:01:21] /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/x86_mmx.ll:1:1: note: scanning from here
[01:01:21] ; ModuleID = 'x86_mmx0-8787f43e282added376259c1adb08b80.rs'
[01:01:21] ^
[01:01:21] /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/x86_mmx.ll:1:5: note: possible intended match here
[01:01:21] ; ModuleID = 'x86_mmx0-8787f43e282added376259c1adb08b80.rs'
[01:01:21]     ^
[01:01:21] 
[01:01:21] ------------------------------------------
[01:01:21] 
[01:01:21] thread '[codegen] codegen/x86_mmx.rs' panicked at 'explicit panic', /checkout/src/tools/compiletest/src/runtest.rs:2541:8
[01:01:21] note: Run with `RUST_BACKTRACE=1` for a backtrace.

@kennytm kennytm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Nov 22, 2017
@alexcrichton
Copy link
Member Author

@bors: r=eddyb

@bors
Copy link
Contributor

bors commented Nov 25, 2017

📌 Commit 952f6e9 has been approved by eddyb

@bors
Copy link
Contributor

bors commented Nov 25, 2017

⌛ Testing commit 952f6e9fca35564bed8ffe436e80618c60a6f59e with merge f5520ed945f4e729b4edf8650db73b8d6890aff8...

@bors
Copy link
Contributor

bors commented Nov 25, 2017

💔 Test failed - status-travis

@MaloJaffre
Copy link
Contributor

CI failed on asmjs because src/test/codegen/x86_mmx.rs is not ignored on this platform:

[01:57:10] failures:
[01:57:10] 
[01:57:10] ---- [codegen] codegen/x86_mmx.rs stdout ----
[01:57:10] 	
[01:57:10] error: verification with 'FileCheck' failed
[01:57:10] status: exit code: 1
[01:57:10] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/llvm/build/bin/FileCheck" "--input-file" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/x86_mmx.ll" "/checkout/src/test/codegen/x86_mmx.rs"
[01:57:10] stdout:
[01:57:10] ------------------------------------------
[01:57:10] 
[01:57:10] ------------------------------------------
[01:57:10] stderr:
[01:57:10] ------------------------------------------
[01:57:10] /checkout/src/test/codegen/x86_mmx.rs:24:18: error: expected string not found in input
[01:57:10]  // CHECK-LABEL: define x86_mmx @a(x86_mmx*{{.*}}, x86_mmx{{.*}})
[01:57:10]                  ^
[01:57:10] /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/x86_mmx.ll:1:1: note: scanning from here
[01:57:10] ; ModuleID = 'x86_mmx0-8787f43e282added376259c1adb08b80.rs'
[01:57:10] ^
[01:57:10] /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/x86_mmx.ll:1:5: note: possible intended match here
[01:57:10] ; ModuleID = 'x86_mmx0-8787f43e282added376259c1adb08b80.rs'
[01:57:10]     ^
[01:57:10] 
[01:57:10] ------------------------------------------
[01:57:10] 
[01:57:10] thread '[codegen] codegen/x86_mmx.rs' panicked at 'explicit panic', /checkout/src/tools/compiletest/src/runtest.rs:2570:8
[01:57:10] note: Run with `RUST_BACKTRACE=1` for a backtrace.

This commit adds compiler support for two basic operations needed for binding
SIMD on x86 platforms:

* First, a `nontemporal_store` intrinsic was added for the `_mm_stream_ps`, seen
  in rust-lang/stdarch#114. This was relatively straightforward and is
  quite similar to the volatile store intrinsic.

* Next, and much more intrusively, a new type to the backend was added. The
  `x86_mmx` type is used in LLVM for a 64-bit vector register and is used in
  various intrinsics like `_mm_abs_pi8` as seen in rust-lang/stdarch#74.
  This new type was added as a new layout option as well as having support added
  to the trans backend. The type is enabled with the `#[repr(x86_mmx)]`
  attribute which is intended to just be an implementation detail of SIMD in
  Rust.

I'm not 100% certain about how the `x86_mmx` type was added, so any extra eyes
or thoughts on that would be greatly appreciated!
@alexcrichton
Copy link
Member Author

@bors: r=eddyb

@bors
Copy link
Contributor

bors commented Nov 25, 2017

📌 Commit fe53a81 has been approved by eddyb

@kennytm
Copy link
Member

kennytm commented Nov 25, 2017

To be safe I think you need to ignore everything like src/test/codegen.rs

// ignore-aarch64
// ignore-aarch64_be
// ignore-arm
// ignore-armeb
// ignore-avr
// ignore-bpfel
// ignore-bpfeb
// ignore-hexagon
// ignore-mips
// ignore-mipsel
// ignore-mips64
// ignore-mips64el
// ignore-msp430
// ignore-powerpc64
// ignore-powerpc64le
// ignore-powerpc
// ignore-r600
// ignore-amdgcn
// ignore-sparc
// ignore-sparcv9
// ignore-sparcel
// ignore-s390x
// ignore-tce
// ignore-thumb
// ignore-thumbeb
// ignore-xcore
// ignore-nvptx
// ignore-nvptx64
// ignore-le32
// ignore-le64
// ignore-amdil
// ignore-amdil64
// ignore-hsail
// ignore-hsail64
// ignore-spir
// ignore-spir64
// ignore-kalimba
// ignore-shave
// ignore-wasm32
// ignore-wasm64
// ignore-emscripten

Or, maybe, could we just add #![cfg(any(target_arch = "x86", target_arch = "x86_64"))]?

@kennytm kennytm added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 25, 2017
@bors
Copy link
Contributor

bors commented Nov 25, 2017

⌛ Testing commit fe53a81 with merge 128b40f...

bors added a commit that referenced this pull request Nov 25, 2017
rustc: Add support for some more x86 SIMD ops

This commit adds compiler support for two basic operations needed for binding
SIMD on x86 platforms:

* First, a `nontemporal_store` intrinsic was added for the `_mm_stream_ps`, seen
  in rust-lang/stdarch#114. This was relatively straightforward and is
  quite similar to the volatile store intrinsic.

* Next, and much more intrusively, a new type to the backend was added. The
  `x86_mmx` type is used in LLVM for a 64-bit vector register and is used in
  various intrinsics like `_mm_abs_pi8` as seen in rust-lang/stdarch#74.
  This new type was added as a new layout option as well as having support added
  to the trans backend. The type is enabled with the `#[repr(x86_mmx)]`
  attribute which is intended to just be an implementation detail of SIMD in
  Rust.

I'm not 100% certain about how the `x86_mmx` type was added, so any extra eyes
or thoughts on that would be greatly appreciated!
@bors
Copy link
Contributor

bors commented Nov 26, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: eddyb
Pushing 128b40f to master...

@bors bors merged commit fe53a81 into rust-lang:master Nov 26, 2017
@alexcrichton alexcrichton deleted the simd-llvm-changes branch December 6, 2017 20:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.