Introduce DynSend and DynSync auto trait for parallel compiler #107586

Merged: 7 commits, May 13, 2023

@SparrowLii SparrowLii commented Feb 2, 2023

part of parallel-rustc #101566

This PR introduces the DynSend / DynSync traits and the FromDyn / IntoDyn structs in rustc_data_structures::marker. FromDyn dynamically checks data structures for thread safety when switching to a parallel environment (for example, when calling par_for_each_in). The check happens only when -Z threads > 1, so it does not affect compile performance in single-threaded mode.
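To make the idea concrete, here is a minimal sketch of the runtime-checked wrapper pattern described above. It is not the code from this PR: the actual implementation relies on auto traits, which need a nightly compiler, while this sketch uses a plain unsafe marker trait so it builds on stable Rust. The names DYN_THREAD_SAFE, set_dyn_thread_safe, is_dyn_thread_safe, and LocalData are assumptions made for illustration only.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical global mode flag, standing in for "-Z threads > 1".
static DYN_THREAD_SAFE: AtomicBool = AtomicBool::new(false);

pub fn set_dyn_thread_safe(on: bool) {
    DYN_THREAD_SAFE.store(on, Ordering::Relaxed);
}

pub fn is_dyn_thread_safe() -> bool {
    DYN_THREAD_SAFE.load(Ordering::Relaxed)
}

/// Marker for types that may cross threads once parallel mode is on.
///
/// # Safety
/// Implementors promise that the value is safe to send whenever
/// `is_dyn_thread_safe()` returns true.
pub unsafe trait DynSend {}

/// Wrapper that turns a `DynSend` value into a `Send` one after a runtime check.
pub struct FromDyn<T>(T);

impl<T: DynSend> FromDyn<T> {
    pub fn from(value: T) -> Self {
        // The check runs once, at the switch into a parallel environment.
        assert!(is_dyn_thread_safe(), "FromDyn used outside parallel mode");
        FromDyn(value)
    }

    pub fn into_inner(self) -> T {
        self.0
    }
}

// SAFETY (sketch): a FromDyn can only be built after the runtime check passed,
// and the DynSend contract covers exactly that situation.
unsafe impl<T: DynSend> Send for FromDyn<T> {}

/// Hypothetical type that the compiler does not consider `Send`.
pub struct LocalData {
    value: Cell<u32>,
    _not_send: PhantomData<*const ()>,
}

impl LocalData {
    pub fn new(value: u32) -> Self {
        LocalData { value: Cell::new(value), _not_send: PhantomData }
    }

    pub fn get(&self) -> u32 {
        self.value.get()
    }
}

// SAFETY (sketch): LocalData only holds a plain integer.
unsafe impl DynSend for LocalData {}
```

With the flag on, FromDyn::from(LocalData::new(41)) can be moved into std::thread::spawn, which LocalData alone would not allow; with the flag off, the constructor panics instead of letting the value cross threads.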

r? @cjgillot

@rustbot rustbot added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 2, 2023

@@ -952,6 +958,151 @@ fn analysis(tcx: TyCtxt<'_>, (): ()) -> Result<()> {
    Ok(())
}

fn non_par_analysis(tcx: TyCtxt<'_>) -> Result<()> {
Member

This code duplication is really unfortunate.

Member Author

That's right. I haven't thought of an elegant way to write it yet, but I'll fix that soon.

// runtime whether these non-shared data structures actually exist.
unsafe impl<'tcx> DynSendSyncCheck for TyCtxt<'tcx> {
    #[inline]
    fn check_send_sync(&self) {
Member

Can you use let GlobalCtxt { a, b, c } = self for exhaustiveness checking?

Member Author

Yea, it makes sense!
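The exhaustiveness trick suggested above can be sketched as follows. Ctxt, its fields, and check_field are hypothetical stand-ins, not the real GlobalCtxt; the point is only that a struct pattern without a `..` rest pattern must name every field.

```rust
/// Hypothetical stand-in for a context struct with several fields.
pub struct Ctxt {
    pub names: Vec<String>,
    pub count: usize,
    pub enabled: bool,
}

/// Hypothetical per-field check; here it only counts the field as visited.
fn check_field<T>(_field: &T) -> usize {
    1
}

impl Ctxt {
    /// Visits every field and returns how many were checked.
    pub fn check_all_fields(&self) -> usize {
        // No `..` in the pattern: if a field is later added to `Ctxt`,
        // this line stops compiling until the new field is listed here.
        let Ctxt { names, count, enabled } = self;
        check_field(names) + check_field(count) + check_field(enabled)
    }
}
```

Compared with writing self.names, self.count, and so on by hand, the destructuring form turns a forgotten field into a compile error rather than a silently skipped check.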

Comment on lines 40 to 92
// Only set by the `-Z threads` compile option
pub unsafe fn set_parallel() {
    let p = SyncUnsafeCell::raw_get(&PARALLEL as *const _);
    *p = true;
}
Member

First of all, it would be great to have a doc comment here, especially given that this is an unsafe function. Second of all, at first glance it seems like this can be more simply written as *PARALLEL.get() = true, am I missing something? Lastly, is is_parallel hot? Can we use an AtomicUsize instead?

Member

Even if it's a little hot, it's unlikely that an atomic integer will have a performance impact, since this is just reading from it.

Member Author

SparrowLii, Feb 2, 2023

Thanks for the review! PARALLEL will only be set once, so I wanted to take advantage of that to minimize the cost of reading it. with_context_opt might be hot, but I doubt it is necessary to check thread safety there. Apart from that, I don't think is_parallel() is hot, since it is only used in relatively top-level logic to decide whether to switch to a parallel environment.

Member

You should first benchmark it before going for the more unsafe variant. Atomics have no to minimal overhead depending on the exact use and ordering (which I think can be relaxed here because we don't need to sync any other writes?).

Member Author

SparrowLii, Feb 2, 2023

OK, I changed it to AtomicBool instead. Can you help start a perf run? Thanks!
And yes, I think we can just use Relaxed.
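For reference, a safe version of the flag along the lines agreed on in this thread might look like the sketch below. The names PARALLEL, set_parallel, and is_parallel follow the snippet under review, but the body is an assumption and not the code that was pushed.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Off by default; meant to be switched on once during start-up.
static PARALLEL: AtomicBool = AtomicBool::new(false);

/// Turns parallel mode on. Safe to call from any thread.
pub fn set_parallel() {
    // Relaxed: the flag does not publish any other writes,
    // so no additional synchronization is requested.
    PARALLEL.store(true, Ordering::Relaxed);
}

/// Reports whether parallel mode has been turned on.
#[inline]
pub fn is_parallel() -> bool {
    PARALLEL.load(Ordering::Relaxed)
}
```

This removes the `unsafe` from both the definition and every call site; whether the relaxed load costs anything measurable on a hot path is what the perf run is meant to show.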

@Noratrieb

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 2, 2023

bors commented Feb 2, 2023

⌛ Trying commit 4086ebae9c6b917e51b0f314c18a3dfd032d0e14 with merge 62ba597a41741055fcf131dcee8b691cc9445515...

bors commented Feb 2, 2023

☀️ Try build successful - checks-actions
Build commit: 62ba597a41741055fcf131dcee8b691cc9445515

@rust-timer

Finished benchmarking commit (62ba597a41741055fcf131dcee8b691cc9445515): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

                             mean   range          count
Regressions ❌ (primary)      0.6%   [0.3%, 1.2%]   134
Regressions ❌ (secondary)    0.7%   [0.1%, 1.9%]   77
Improvements ✅ (primary)     -      -              0
Improvements ✅ (secondary)   -      -              0
All ❌✅ (primary)             0.6%   [0.3%, 1.2%]   134

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

                             mean    range            count
Regressions ❌ (primary)      -       -                0
Regressions ❌ (secondary)    2.4%    [2.2%, 2.7%]     2
Improvements ✅ (primary)     -       -                0
Improvements ✅ (secondary)   -2.5%   [-2.8%, -1.6%]   4
All ❌✅ (primary)             -       -                0

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

                             mean   range          count
Regressions ❌ (primary)      1.0%   [0.7%, 1.3%]   7
Regressions ❌ (secondary)    2.2%   [1.4%, 2.8%]   22
Improvements ✅ (primary)     -      -              0
Improvements ✅ (secondary)   -      -              0
All ❌✅ (primary)             1.0%   [0.7%, 1.3%]   7

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Feb 2, 2023

SparrowLii commented Feb 2, 2023

I think there may be two reasons for the regression:
1. Multiple if let Some(shared_data) = shared_data.as_ref() checks in graph.rs.
2. is_parallel() is hot and inefficient.
I'll try to fix them tomorrow.

Zoxc commented Feb 2, 2023

I've been thinking over how to best approach specialization. I think the dynamic dispatch entry point to rustc_query_impl would be a good place to branch. Using proof objects instead of GAT seems more flexible, at least for locks. I can write up some details on those later.

It seems like a good idea to land locks with a runtime switch first so there is an optimized baseline to compare with specialization. I suggest finishing my branch by extracting just the lock implementation and moving it to a new lock module under sync. You can also add a mode module with a global atomic with 3 states (uninit, on, off). Use a compare and swap to ensure it can only move from uninit to one of the other states. I'm not quite sure what's going on with the DynSendSyncCheck trait, but the manual listing of fields is a bit awkward. We can however literally copy Send and Sync from the standard library and I'd suggest doing so, placing them in a marker module under sync with a rename.
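The three-state mode switch suggested above could be sketched like this. The constants UNINIT, OFF, and ON, the MODE static, and the function names are assumptions chosen for illustration; only the overall shape (a global atomic that can move from uninit to exactly one other state via compare-and-swap) comes from the comment.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

const UNINIT: u8 = 0;
const OFF: u8 = 1;
const ON: u8 = 2;

// Hypothetical global mode; starts uninitialized.
static MODE: AtomicU8 = AtomicU8::new(UNINIT);

/// Sets the mode exactly once.
/// Returns `Err(current_state)` if the mode was already initialized.
pub fn set_mode(parallel: bool) -> Result<(), u8> {
    let new = if parallel { ON } else { OFF };
    // The swap only succeeds while the state is still UNINIT,
    // so the mode can never flip between ON and OFF later.
    MODE.compare_exchange(UNINIT, new, Ordering::Relaxed, Ordering::Relaxed)
        .map(|_| ())
}

/// Reads the mode; panics if it is queried before initialization.
pub fn is_parallel() -> bool {
    match MODE.load(Ordering::Relaxed) {
        ON => true,
        OFF => false,
        _ => panic!("mode queried before initialization"),
    }
}
```

Compared with a plain boolean, the extra uninit state makes both a second initialization and a read before initialization detectable, at the cost of a three-way match on each read.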

SparrowLii commented Feb 3, 2023

In my local tests, is_parallel() appears to be the main cause of the regression. After I changed it to a const fn that always returns false, the regression was no longer visible. In addition, the performance of AtomicBool, SyncUnsafeCell, and static mut does not differ much.

Also, as I guessed, with_context_opt is hot, and calling is_parallel() in it looks like the main reason.

@SparrowLii

Can we run another perf? Thanks!

@Noratrieb

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 3, 2023

bors commented Feb 3, 2023

⌛ Trying commit b2d7910a011fe52c5ab434c695a5caf1a5602725 with merge 2949dcde96d9502e79a5af27f252db8c97e8533e...

bors commented Feb 3, 2023

☀️ Try build successful - checks-actions
Build commit: 2949dcde96d9502e79a5af27f252db8c97e8533e

@SparrowLii SparrowLii force-pushed the parallel-query branch 3 times, most recently from 25b4906 to d8ca8e6 Compare April 10, 2023 01:54

bors commented Apr 18, 2023

☔ The latest upstream changes (presumably #110243) made this pull request unmergeable. Please resolve the merge conflicts.

apiraino commented May 3, 2023

@SparrowLii if I read the comment thread correctly @cjgillot should approve this PR after one last rebase? Thanks!

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 3, 2023

SparrowLii commented May 6, 2023

@cjgillot Can it be merged now? : ) I don't have privileges so I need your help

@cjgillot cjgillot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 7, 2023
@cjgillot

@bors r+ rollup=never

bors commented May 13, 2023

📌 Commit d7e3e5b has been approved by cjgillot

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 13, 2023

bors commented May 13, 2023

⌛ Testing commit d7e3e5b with merge dd8ec9c...

bors commented May 13, 2023

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing dd8ec9c to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 13, 2023
@bors bors merged commit dd8ec9c into rust-lang:master May 13, 2023
@rustbot rustbot added this to the 1.71.0 milestone May 13, 2023
@rust-timer

Finished benchmarking commit (dd8ec9c): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

                             mean    range            count
Regressions ❌ (primary)      -       -                0
Regressions ❌ (secondary)    -       -                0
Improvements ✅ (primary)     -2.0%   [-2.0%, -2.0%]   1
Improvements ✅ (secondary)   -1.4%   [-1.4%, -1.4%]   1
All ❌✅ (primary)             -2.0%   [-2.0%, -2.0%]   1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 660.401s -> 660.339s (-0.01%)

Labels
A-query-system: Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html)
A-translation: Area: Translation infrastructure, and migrating existing diagnostics to SessionDiagnostic
merged-by-bors: This PR was explicitly merged by bors.
S-waiting-on-bors: Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.