Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mutex.cpp, cond.cpp: Use static dispatch #3770

Merged
merged 13 commits into from
Jun 22, 2023

Conversation

cpplearner
Copy link
Contributor

@cpplearner cpplearner commented Jun 14, 2023

After #2317 removed stl_critical_section_vista and stl_condition_variable_vista, only the Win7 versions inherit from the corresponding interfaces. Thus the functions in mutex.cpp and cond.cpp could directly use the Win7 classes, without going through the interfaces. This improves performance, and makes it easier to implement constexpr mutex.

The first commit (24922f3) changes the getters in _Mtx_internal_imp_t and _Cnd_internal_imp_t to return the Win7 classes. Since these classes are final, apparently the compiler knows to use static dispatch for them.

Subsequent commits change create_stl_critical_section and create_stl_condition_variable to create non-polymorphic classes, with the "vptr" set to nullptr.

I can imagine that these cause trouble if a mutex object is created by new code and used by old code, and old code uses the old function definitions (because it links to an old static lib?). IIUC this scenario is unsupported, and it will be a noticeable breakage due to null dereference.

@cpplearner cpplearner requested a review from a team as a code owner June 14, 2023 07:31
@strega-nil-ms

This comment was marked as resolved.

@StephanTLavavej StephanTLavavej added the performance Must go faster label Jun 14, 2023
@StephanTLavavej

This comment was marked as resolved.

@StephanTLavavej

This comment was marked as resolved.

@StephanTLavavej StephanTLavavej self-assigned this Jun 15, 2023
@strega-nil-ms

This comment was marked as resolved.

@azure-pipelines

This comment was marked as resolved.

…iable`.

We never provided user-visible declarations for what they returned. It's extremely unlikely that users were successfully calling them in any useful way, so the bincompat risk of nulling out the vptrs is minimal.

I am opting to `=delete` these functions, instead of simply not declaring them, so that attempted calls will have a slightly more comprehensible error message. As the TRANSITION comments note, in theory we could provide implementations that return the underlying Win32 machinery which actually would be possible to document.

I am also changing `condition_variable`'s `native_handle_type` from `_Cnd_t` (an alias provided by `<xthreads.h>` for `_Cnd_internal_imp_t*`, which is forward-declared and never defined for users) to `void*` and grouping it with the function, which follows the Standard's order.
stl/src/primitives.hpp Outdated Show resolved Hide resolved
stl/src/primitives.hpp Outdated Show resolved Hide resolved
stl/src/primitives.hpp Outdated Show resolved Hide resolved
stl/src/primitives.hpp Outdated Show resolved Hide resolved
stl/src/primitives.hpp Outdated Show resolved Hide resolved
stl/inc/mutex Outdated
@@ -95,9 +95,8 @@ public:

using native_handle_type = void*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we still declare the native_handle_type even if native_handle is not usable? I guess this is technically conforming, but looks weird.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's an open question.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[thread.req.native]/1:

Several classes described in this Clause have members native_handle_type and native_handle. The presence of these members and their semantics is implementation-defined."

We can do whatever we like here, if we document it. I'd prefer for native_handle_type to be undefined or denote void, but we can debate it after merging this change.

Whatever we do, we need to update https://learn.microsoft.com/en-us/cpp/standard-library/mutex-class-stl?view=msvc-170#native_handle.

@AlexGuteniev
Copy link
Contributor

makes it easier to implement constexpr mutex.

I don't see it.
We need to support old DLL version, which will want vtable.

(I previously attempted a PR with fake vtable instead, that was rejected)

@cpplearner
Copy link
Contributor Author

makes it easier to implement constexpr mutex.

I don't see it. We need to support old DLL version, which will want vtable.

(I previously attempted a PR with fake vtable instead, that was rejected)

Yes, new code will not work with old DLL/LIB, but IIUC that's unsupported anyway.

@AlexGuteniev
Copy link
Contributor

Yes, new code will not work with old DLL/LIB, but IIUC that's unsupported anyway.

I believe it is unsupported for the LIB, but for the DLL it is supported due to "printer driver problem"

@strega-nil-ms
Copy link
Contributor

makes it easier to implement constexpr mutex.

I don't see it. We need to support old DLL version, which will want vtable.

(I previously attempted a PR with fake vtable instead, that was rejected)

@AlexGuteniev makes a very good point. If we're working towards constexpr mutex, we would need to allow the user code to create a stl_critical_section_win7 with the correct vtable, rather than no vtable. It should theoretically be constexpr to do this if we declare stl_critical_section_win7 in a public header (although suitably uglified).

@AlexGuteniev
Copy link
Contributor

It should theoretically be constexpr to do this if we declare stl_critical_section_win7 in a public header (although suitably uglified).

Let's work towards this then!

And then we won't even need manual devirtualization as in this PR: due to final keyword the compiler would devirtualize calls.

https://godbolt.org/z/6PM87qvYP

@AlexGuteniev
Copy link
Contributor

It should theoretically be constexpr to do this if we declare stl_critical_section_win7 in a public header (although suitably uglified).

Let's work towards this then!

And then we won't even need manual devirtualization as in this PR: due to final keyword the compiler would devirtualize calls.

https://godbolt.org/z/6PM87qvYP

Oh, forgot about [thread.mutex.class]/3:

[...] It is a standard-layout class ([class.prop]).

It can't be made constexpr I thnk.

@strega-nil-ms
Copy link
Contributor

strega-nil-ms commented Jun 20, 2023

@AlexGuteniev

ah. That is unfortunate.

edit: given that we never call typeid, we could theoretically just create a vtable manually :>

@StephanTLavavej
Copy link
Member

Actually, I believe I have convinced myself that we don't even need to retain the void* unused null placeholder for the vptr. The headers don't actually care how much of the aligned storage space is used, and (aside from native_handle() which I assert no user could feasibly have been calling), nothing cares about the internal layout.

I was led to this by thinking about whether stl_condition_variable_win7 needed the vptr, given that condition variables have no mirrored _Count to worry about. Then I realized that _Mtx_internal_imp_mirror / _Mtx_internal_imp_t separate the aligned storage (where the stl_critical_section_win7 lives) from the _Count / count.

To repeat/clarify, because ABI issues are unfamiliar to reason about, there is only one copy of the STL's DLL or static LIB that we ever need to worry about. (It is possible for multiple user DLLs to be loaded into a single process with separate STL DLLs or statically linked LIBs, but they cannot communicate CRT/STL objects between themselves.) So, the STL's DLL/LIB is always internally consistent and we don't need to worry about mix-and-match the way that we do for headers built by users. The flat C API (with the _Count and native_handle() "punch through" exceptions as noted) ensures that the user-built headers are insulated from what's happening in the STL's DLL/LIB. As long as native_handle() is unused, and _Count at the mirrored layer outside the aligned storage appears at the expected location, we are fine.

I will push a change. If anyone believes that my analysis is incorrect, I welcome corrections, but please check with me before beginning an "edit war" of commits. 😺

@StephanTLavavej StephanTLavavej removed their assignment Jun 21, 2023
@StephanTLavavej
Copy link
Member

One more thought: for a constexpr mutex followup PR, we could have an escape hatch, so that "print driver" code could give up constexpr mutex and continue calling the separately compiled initialization functions. That would be my preferred path forward, since it would conform by default for the vast majority of code that doesn't care (even among code that takes advantage of bincompat guarantees).

@AlexGuteniev
Copy link
Contributor

I am also not concerned about the dreaded "print driver" scenario that we may or may not actually need to support (it is unclear whether it is actually guaranteed to work according to our documented restrictions on MSDN). Because this isn't adding new separately compiled APIs (which is the sort of thing that would cause immediate failure on load), and instead would only be detectable upon use of native_handle(), I think it falls into the same "effectively unobservable" category.

If we don't call the constructor from DLL, and don't initialize vtable by other means, I'm afraid "print driver" scenario would be observable by crash on trying to lock the mutex.

@AlexGuteniev
Copy link
Contributor

One more thought: for a constexpr mutex followup PR, we could have an escape hatch, so that "print driver" code could give up constexpr mutex and continue calling the separately compiled initialization functions.

My understanding that escape hatch is not good here, unlike, say, atomic alignment escape hatch.
Because the determination whether you affected by "print driver" problem or not is made not when creating app, but by user environment.
Even if technically not claimed supported, I think this scenario should be maximally supported, just because failure to support it would lead to random failures. And in this particular case, when export surface is intact, the failures are more subtle.

@StephanTLavavej
Copy link
Member

I've pushed a revert, to restore the null vptrs (in advance of the weekly maintainer meeting). @strega-nil-ms noted that a guaranteed crash was safer and better than treating data as a vptr.

We will still need to talk about the print driver scenario for a followup constexpr mutex PR - I see your point, @AlexGuteniev, but maximally supporting that scenario is in conflict with many users who have expected and requested constexpr mutex conformance. If the scenario does exist in the wild, and people are using mutex in it, I don't think the failures will be "random" (like race conditions); they would happen immediately upon attempting to use the mutex, as @strega-nil-ms has envisioned by restoring the null vptr.

Copy link
Member

@CaseyCarter CaseyCarter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From Discord discussion, STL plans to address my comments, so I'll go ahead and approve.

stl/inc/mutex Outdated
@@ -95,9 +95,8 @@ public:

using native_handle_type = void*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[thread.req.native]/1:

Several classes described in this Clause have members native_handle_type and native_handle. The presence of these members and their semantics is implementation-defined."

We can do whatever we like here, if we document it. I'd prefer for native_handle_type to be undefined or denote void, but we can debate it after merging this change.

Whatever we do, we need to update https://learn.microsoft.com/en-us/cpp/standard-library/mutex-class-stl?view=msvc-170#native_handle.

stl/inc/mutex Outdated Show resolved Hide resolved
@StephanTLavavej
Copy link
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej StephanTLavavej merged commit 10f26f0 into microsoft:main Jun 22, 2023
@StephanTLavavej
Copy link
Member

Thanks for this major and long-awaited simplification! 🎉 😻 ✨

@cpplearner cpplearner deleted the static-dispatch branch December 16, 2023 10:05
CaseyCarter added a commit to CaseyCarter/STL that referenced this pull request Feb 13, 2024
By simply removing `~_Mutex_base`. This destructor has been only a debug check since microsoftGH-3770. Losing the debug check is a small price to pay to elide the destructor call, doubly so for static storage duration mutexes that now won't need dynamic initializers to register with `at_exit`.

`_Mtx_destroy` and `_Mtx_destroy_in_situ` are only called by APIs preserved for binary compatibility, so mark them as also "preserved for binary compatibility".
@CaseyCarter CaseyCarter mentioned this pull request Feb 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants