Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support inlining cross-crate TLS access on MSVC #84933

Closed
alexcrichton opened this issue May 4, 2021 · 4 comments · Fixed by #108089
Closed

Support inlining cross-crate TLS access on MSVC #84933

alexcrichton opened this issue May 4, 2021 · 4 comments · Fixed by #108089
Labels
A-thread-locals Area: Thread local storage (TLS) O-windows-msvc Toolchain: MSVC, Operating system: Windows

Comments

@alexcrichton
Copy link
Member

There exists a __getit function as part of the thread_local! implementation in the standard library. On the MSVC target this function is not #[inline], meaning that it can't get inlined across crates. This issue is about fixing that, for some scenarios, in the future.

The #[inline] attribute was added in #84876 and, like historical attempts it does not apply the attribute for MSVC targets. The reason for this is that MSVC appears to not support importing thread-local variables across a DLL boundary.

For example this C code:

__declspec(dllimport, thread) extern int a;

int get() {
    return a;
}

yields this error when compiled by Clang:

<source>:1:42: error: 'a' cannot be thread local when declared 'dllimport'
__declspec(dllimport, thread) extern int a;
                                         ^
1 error generated.

I do not personally know the history of this. Empirically this appears to be true, if #[inline] is applied to __getit then the compiler segfaults on MSVC. The Rust compiler generally applies dllimport for you and tries to infer when this would otherwise happen, which probably makes matters worse here.

To reproduce this in Rust it's relatively simple. First you'll need to acquire a patched compiler which applies #[inline] to the __getit function, then compile a library:

$ cat driver.rs
pub fn go() {
    std::collections::HashMap::<i32, i32>::new();
}
$ rustc driver.rs --crate-type dylib
$ cat main.rs
fn main() {
    driver::go();
}
$ rustc main.rs --extern driver=driver.dll

This binary will nondeterministically crash. The crash signature is variable, too. Here the HashMap constructor currently uses thread locals internally which is what triggers this bug.

In any case things appear to not work with #[inline], notably across DLL boundaries. What can work, however, is inlining thread local accesses that don't cross DLL boundaries. Ideally we would have some sort of conditional #[inline] attribute for this. Nothing of this form exists in rustc today, and implementing this is definitely quite a stretch. In any case though I wanted to file an issue about this possible improvement to libstd.

@alexcrichton alexcrichton added O-windows-msvc Toolchain: MSVC, Operating system: Windows A-thread-locals Area: Thread local storage (TLS) labels May 4, 2021
@Aaron1011
Copy link
Member

Can't LLVM decide to inline __getit even without an #[inline] annotation (e.g. during LTO)?

@alexcrichton
Copy link
Member Author

Indeed! The problem is inlining across DLL boundaries, not across compilation unit boundaries. In Rust that can equate to cross-crate linking in the case of the compiler, but for almost all other Rust projects there are no DLL boundaries. This issue is about improving libstd for all other crates other than rustc.

@alexcrichton
Copy link
Member Author

Some more detailed information about how this might be supported is here -- #84876 (comment)

@ChrisDenton
Copy link
Member

So to recap, the issue is the dylib crate type? This produces a kind of hybrid library that's part dynamic and part static. This means that code can effectively be moved across normal module boundaries while the associated statics (or thread locals in this case) remain behind. To be honest, I'm (pleasantly) surprised this doesn't cause more problems.

I have attempted to implement the above suggestion here and it appears to work in the simple case at least (fair warning: the linked crates are for my own education and not something that's production ready).

Though to be honest (and as Alex already mentioned) whatever the solution, I'd prefer if it could somehow be special cased for dylibs so as not to affect all other crate types. I'm not sure how that would be done as there's (IIRC) no if cfg for crate types.

Manishearth added a commit to Manishearth/rust that referenced this issue Nov 22, 2022
Forbid inlining `thread_local!`'s `__getit` function on Windows

Sadly, this will make things slower to avoid UB in an edge case, but it seems hard to avoid... and really whenever I look at this code I can't help but think we're asking for trouble.

It's pretty dodgy for us to leave this as a normal function rather than `#[inline(never)]`, given that if it *does* get inlined into a dynamically linked component, it's extremely unsafe (you get some other thread local, or if you're lucky, crash). Given that it's pretty rare for people to use dylibs on Windows, the fact that we haven't gotten bug reports about it isn't really that convincing. Ideally we'd come up with some kind of compiler solution (that avoids paying for this cost when static linking, or *at least* for use within the same crate...), but it's not clear what that looks like.

Oh, and because all this is only needed when we're implementing `thread_local!` with `#[thread_local]`, this patch adjusts the `cfg_attr` to be `all(windows, target_thread_local)` as well.

r? `@ChrisDenton`

See also rust-lang#84933, which is about improving the situation.
JohnTitor pushed a commit to JohnTitor/rust that referenced this issue Nov 22, 2022
Forbid inlining `thread_local!`'s `__getit` function on Windows

Sadly, this will make things slower to avoid UB in an edge case, but it seems hard to avoid... and really whenever I look at this code I can't help but think we're asking for trouble.

It's pretty dodgy for us to leave this as a normal function rather than `#[inline(never)]`, given that if it *does* get inlined into a dynamically linked component, it's extremely unsafe (you get some other thread local, or if you're lucky, crash). Given that it's pretty rare for people to use dylibs on Windows, the fact that we haven't gotten bug reports about it isn't really that convincing. Ideally we'd come up with some kind of compiler solution (that avoids paying for this cost when static linking, or *at least* for use within the same crate...), but it's not clear what that looks like.

Oh, and because all this is only needed when we're implementing `thread_local!` with `#[thread_local]`, this patch adjusts the `cfg_attr` to be `all(windows, target_thread_local)` as well.

r? ``@ChrisDenton``

See also rust-lang#84933, which is about improving the situation.
Manishearth added a commit to Manishearth/rust that referenced this issue Nov 22, 2022
Forbid inlining `thread_local!`'s `__getit` function on Windows

Sadly, this will make things slower to avoid UB in an edge case, but it seems hard to avoid... and really whenever I look at this code I can't help but think we're asking for trouble.

It's pretty dodgy for us to leave this as a normal function rather than `#[inline(never)]`, given that if it *does* get inlined into a dynamically linked component, it's extremely unsafe (you get some other thread local, or if you're lucky, crash). Given that it's pretty rare for people to use dylibs on Windows, the fact that we haven't gotten bug reports about it isn't really that convincing. Ideally we'd come up with some kind of compiler solution (that avoids paying for this cost when static linking, or *at least* for use within the same crate...), but it's not clear what that looks like.

Oh, and because all this is only needed when we're implementing `thread_local!` with `#[thread_local]`, this patch adjusts the `cfg_attr` to be `all(windows, target_thread_local)` as well.

r? ```@ChrisDenton```

See also rust-lang#84933, which is about improving the situation.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Mar 8, 2023
Support TLS access into dylibs on Windows

This allows access to `#[thread_local]`  in upstream dylibs on Windows by introducing a MIR shim to return the address of the thread local. Accesses that go into an upstream dylib will call the MIR shim to get the address of it.

`convert_tls_rvalues` is introduced in `rustc_codegen_ssa` which rewrites MIR TLS accesses to dummy calls which are replaced with calls to the MIR shims when the dummy calls are lowered to backend calls.

A new `dll_tls_export` target option enables this behavior with a `false` value which is set for Windows platforms.

This fixes rust-lang#84933.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Mar 9, 2023
Support TLS access into dylibs on Windows

This allows access to `#[thread_local]`  in upstream dylibs on Windows by introducing a MIR shim to return the address of the thread local. Accesses that go into an upstream dylib will call the MIR shim to get the address of it.

`convert_tls_rvalues` is introduced in `rustc_codegen_ssa` which rewrites MIR TLS accesses to dummy calls which are replaced with calls to the MIR shims when the dummy calls are lowered to backend calls.

A new `dll_tls_export` target option enables this behavior with a `false` value which is set for Windows platforms.

This fixes rust-lang#84933.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Mar 10, 2023
Support TLS access into dylibs on Windows

This allows access to `#[thread_local]`  in upstream dylibs on Windows by introducing a MIR shim to return the address of the thread local. Accesses that go into an upstream dylib will call the MIR shim to get the address of it.

`convert_tls_rvalues` is introduced in `rustc_codegen_ssa` which rewrites MIR TLS accesses to dummy calls which are replaced with calls to the MIR shims when the dummy calls are lowered to backend calls.

A new `dll_tls_export` target option enables this behavior with a `false` value which is set for Windows platforms.

This fixes rust-lang#84933.
@bors bors closed this as completed in f98598c Mar 29, 2023
thomcc pushed a commit to tcdi/postgrestd that referenced this issue Jun 1, 2023
Support TLS access into dylibs on Windows

This allows access to `#[thread_local]`  in upstream dylibs on Windows by introducing a MIR shim to return the address of the thread local. Accesses that go into an upstream dylib will call the MIR shim to get the address of it.

`convert_tls_rvalues` is introduced in `rustc_codegen_ssa` which rewrites MIR TLS accesses to dummy calls which are replaced with calls to the MIR shims when the dummy calls are lowered to backend calls.

A new `dll_tls_export` target option enables this behavior with a `false` value which is set for Windows platforms.

This fixes rust-lang/rust#84933.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-thread-locals Area: Thread local storage (TLS) O-windows-msvc Toolchain: MSVC, Operating system: Windows
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants