Add optimized lock methods for Sharded and refactor Lock #115388

Merged: 3 commits into rust-lang:master from the sharded-lock branch, Sep 11, 2023


Zoxc
Contributor

@Zoxc Zoxc commented Aug 30, 2023

This adds methods to Sharded which pick a shard and also lock it. These branch on parallelism just once instead of twice, improving performance.
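To illustrate the "branch once instead of twice" idea, here is a minimal sketch. It is not the rustc code: the `Sharded`, `ShardGuard`, and `lock_shard_by_hash` definitions below are simplified stand-ins built on `RefCell` and `Mutex`, and the shard count is an assumption chosen for the example. The point is only that a single `match` both selects the shard and decides which kind of locking to use, instead of first branching to pick a shard and then branching again inside a generic lock.

```rust
use std::cell::{RefCell, RefMut};
use std::ops::{Deref, DerefMut};
use std::sync::{Mutex, MutexGuard};

// Hypothetical shard count for this sketch; must be a power of two.
const SHARDS: usize = 4;

/// Simplified stand-in: one unsynchronized cell in single-threaded mode,
/// several mutex-protected shards in parallel mode.
pub enum Sharded<T> {
    Single(RefCell<T>),
    Shards(Vec<Mutex<T>>),
}

/// Guard returned by the combined "pick a shard and lock it" method.
pub enum ShardGuard<'a, T> {
    Single(RefMut<'a, T>),
    Shard(MutexGuard<'a, T>),
}

impl<T> Deref for ShardGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        match self {
            ShardGuard::Single(g) => &**g,
            ShardGuard::Shard(g) => &**g,
        }
    }
}

impl<T> DerefMut for ShardGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        match self {
            ShardGuard::Single(g) => &mut **g,
            ShardGuard::Shard(g) => &mut **g,
        }
    }
}

impl<T: Default> Sharded<T> {
    pub fn new(parallel: bool) -> Self {
        if parallel {
            Sharded::Shards((0..SHARDS).map(|_| Mutex::new(T::default())).collect())
        } else {
            Sharded::Single(RefCell::new(T::default()))
        }
    }
}

impl<T> Sharded<T> {
    /// Picks the shard for `hash` and locks it with a single branch on the mode.
    pub fn lock_shard_by_hash(&self, hash: u64) -> ShardGuard<'_, T> {
        match self {
            Sharded::Single(cell) => ShardGuard::Single(cell.borrow_mut()),
            Sharded::Shards(shards) => {
                let i = (hash as usize) & (SHARDS - 1);
                ShardGuard::Shard(shards[i].lock().unwrap())
            }
        }
    }
}

fn main() {
    for parallel in [false, true] {
        let sharded: Sharded<Vec<u64>> = Sharded::new(parallel);
        sharded.lock_shard_by_hash(5).push(5);
        // 5 and 9 map to the same shard index (1) when SHARDS == 4.
        sharded.lock_shard_by_hash(9).push(9);
        assert_eq!(*sharded.lock_shard_by_hash(1), vec![5, 9]);
    }
}
```

In this sketch the single-threaded variant never touches a mutex at all, which is the kind of saving the description refers to.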

Benchmark for cfg(parallel_compiler) and 1 thread:

Benchmark                Before (time)   After (time)   Change
🟣 clap:check            1.6461s         1.6345s        -0.70%
🟣 hyper:check           0.2414s         0.2394s        -0.83%
🟣 regex:check           0.9205s         0.9143s        -0.67%
🟣 syn:check             1.4981s         1.4869s        -0.75%
🟣 syntex_syntax:check   5.7629s         5.7256s        -0.65%
Total                    10.0690s        10.0008s       -0.68%
Summary                  1.0000s         0.9928s        -0.72%
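
As a sanity check on the last two rows, the following sketch recomputes them from the per-benchmark times. It assumes the "Total" percentage is the relative change of the summed times and that "Summary" is the arithmetic mean of the per-benchmark after/before ratios; the thread does not state how the benchmark tool derives these, so treat both as assumptions. The helper names are made up for the example. Because the table values are rounded, the recomputed figures can differ from the table in the last digit.

```rust
/// Relative change from `before` to `after`, in percent.
fn percent_change(before: f64, after: f64) -> f64 {
    (after / before - 1.0) * 100.0
}

/// Arithmetic mean of the per-benchmark `after / before` ratios.
/// Assumption: this is how the "Summary" row is derived.
fn mean_ratio(rows: &[(f64, f64)]) -> f64 {
    rows.iter().map(|(before, after)| after / before).sum::<f64>() / rows.len() as f64
}

fn main() {
    // (before, after) wall times in seconds, copied from the table above.
    let rows = [
        (1.6461, 1.6345), // clap:check
        (0.2414, 0.2394), // hyper:check
        (0.9205, 0.9143), // regex:check
        (1.4981, 1.4869), // syn:check
        (5.7629, 5.7256), // syntex_syntax:check
    ];
    let before: f64 = rows.iter().map(|r| r.0).sum();
    let after: f64 = rows.iter().map(|r| r.1).sum();
    println!(
        "total: {:.4}s -> {:.4}s ({:+.2}%)",
        before,
        after,
        percent_change(before, after)
    );
    println!("summary ratio: {:.4}", mean_ratio(&rows));
}
```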

cc @SparrowLii


@rustbot
Collaborator

rustbot commented Aug 30, 2023

r? @compiler-errors

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 30, 2023

@SparrowLii
Member

@bors try @rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 31, 2023
@bors
Contributor

bors commented Aug 31, 2023

⌛ Trying commit 73917dd4206ccbacccdc201d529561ce5bd9055f with merge 24259321f2e7a82959b47b86ded3d1073f281746...

compiler/rustc_data_structures/src/sharded.rs (resolved)
compiler/rustc_data_structures/src/sharded.rs (resolved)
compiler/rustc_data_structures/src/sync/lock.rs (outdated, resolved)
/// Safety
/// This method must only be called if `might_be_dyn_thread_safe` was true on lock creation.
#[inline(always)]
unsafe fn lock_assume_sync(&self) {
Member

Do we have to add several unsafe functions? We could just do this inside Lock's methods.

Contributor Author

This keeps the code non-generic and it also makes LockRaw more fully featured.

Member

I'm not sure it's worth adding extra unsafe functions; after all, both approaches are meant to reduce maintenance costs.
cc @compiler-errors
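
For readers following this thread, the trade-off can be sketched as follows. This is an illustration only, not the code under review: `ModeLock`, `lock_assume_no_sync`, `is_locked`, and `unlock` are hypothetical names, and only `lock_assume_sync` and its safety wording come from the snippet quoted above. The `unsafe` methods skip the mode check and rely on the caller's knowledge of how the lock was created, while the safe `lock` method repeats the check on every call.

```rust
use std::cell::Cell;
use std::hint::spin_loop;
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical lock flag whose mode is fixed at creation time.
pub struct ModeLock {
    sync: bool,
    held: Cell<bool>,      // used when `sync` is false
    held_sync: AtomicBool, // used when `sync` is true
}

impl ModeLock {
    pub fn new(sync: bool) -> Self {
        ModeLock { sync, held: Cell::new(false), held_sync: AtomicBool::new(false) }
    }

    /// # Safety
    /// Must only be called if the lock was created with `sync == true`.
    /// (In this sketch misuse only touches the wrong flag; with a layout
    /// that stores both states in a union it would be undefined behavior.)
    #[inline(always)]
    pub unsafe fn lock_assume_sync(&self) {
        debug_assert!(self.sync);
        while self
            .held_sync
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            spin_loop();
        }
    }

    /// # Safety
    /// Must only be called if the lock was created with `sync == false`.
    #[inline(always)]
    pub unsafe fn lock_assume_no_sync(&self) {
        debug_assert!(!self.sync);
        assert!(!self.held.replace(true), "lock was already held");
    }

    /// Safe entry point: pays for the mode check on every call.
    pub fn lock(&self) {
        // SAFETY: each branch matches the mode chosen in `new`.
        if self.sync {
            unsafe { self.lock_assume_sync() }
        } else {
            unsafe { self.lock_assume_no_sync() }
        }
    }

    pub fn is_locked(&self) -> bool {
        if self.sync { self.held_sync.load(Ordering::Relaxed) } else { self.held.get() }
    }

    pub fn unlock(&self) {
        if self.sync {
            self.held_sync.store(false, Ordering::Release)
        } else {
            self.held.set(false)
        }
    }
}

fn main() {
    let lock = ModeLock::new(false);
    // A caller that already knows the mode (for example, because it just
    // matched on it while picking a shard) can skip the second check.
    // SAFETY: the lock was created with `sync == false` above.
    unsafe { lock.lock_assume_no_sync() };
    assert!(lock.is_locked());
    lock.unlock();
    lock.lock();
    assert!(lock.is_locked());
}
```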

@bors
Contributor

bors commented Aug 31, 2023

☀️ Try build successful - checks-actions
Build commit: 24259321f2e7a82959b47b86ded3d1073f281746 (24259321f2e7a82959b47b86ded3d1073f281746)


@klensy
Contributor

klensy commented Aug 31, 2023

Well, regressed heavily for big crates, as of now: https://perf.rust-lang.org/status.html (Or CI feeling bad itself)

Step   Took Expected
await-call-tree   0m28s 0m28s
bitmaps-3.1.0   1m06s 0m59s
cargo-0.60.0   8m24s 5m23s
clap-3.1.6   1m43s 1m32s
coercions   0m56s 0m54s
cranelift-codegen-0.82.1   9m50s 3m02s

@rust-timer
Collaborator

Finished benchmarking commit (24259321f2e7a82959b47b86ded3d1073f281746): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

                             mean     range             count
Regressions ❌ (primary)      -        -                 0
Regressions ❌ (secondary)    -        -                 0
Improvements ✅ (primary)     -        -                 0
Improvements ✅ (secondary)   -0.4%    [-0.4%, -0.4%]    1
All ❌✅ (primary)            -        -                 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

                             mean     range             count
Regressions ❌ (primary)      -        -                 0
Regressions ❌ (secondary)    2.2%     [2.2%, 2.2%]      1
Improvements ✅ (primary)     -0.8%    [-1.1%, -0.6%]    3
Improvements ✅ (secondary)   -        -                 0
All ❌✅ (primary)            -0.8%    [-1.1%, -0.6%]    3

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

                             mean     range             count
Regressions ❌ (primary)      -        -                 0
Regressions ❌ (secondary)    2.5%     [2.0%, 3.5%]      3
Improvements ✅ (primary)     -        -                 0
Improvements ✅ (secondary)   -        -                 0
All ❌✅ (primary)            -        -                 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 631.655s -> 631.389s (-0.04%)
Artifact size: 316.64 MiB -> 316.65 MiB (0.00%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 31, 2023
@klensy
Contributor

klensy commented Aug 31, 2023

Well, regressed heavily for big crates, as of now: https://perf.rust-lang.org/status.html (Or CI feeling bad itself)
Step Took Expected
await-call-tree 0m28s 0m28s
bitmaps-3.1.0 1m06s 0m59s
cargo-0.60.0 8m24s 5m23s
clap-3.1.6 1m43s 1m32s
coercions 0m56s 0m54s
cranelift-codegen-0.82.1 9m50s 3m02s

Perf looks neutral. Why, in that case, does the time taken differ so much for some benchmarks? cranelift took about 3x longer, for example.

@klensy
Contributor

klensy commented Aug 31, 2023

And in the next perf run the time went back to normal, sus:

Currently benchmarking: 6ff94474e1d11.
Time left: 9m56s

Step   Took Expected
await-call-tree   0m30s 0m28s
bitmaps-3.1.0   1m02s 1m06s
cargo-0.60.0   5m26s 8m24s
clap-3.1.6   1m32s 1m43s
coercions   0m55s 0m56s
cranelift-codegen-0.82.1   3m04s 9m50s

@Zoxc Zoxc force-pushed the sharded-lock branch 2 times, most recently from bfcd7a1 to d500310 Compare September 3, 2023 01:30
@Zoxc Zoxc changed the title Add optimized lock methods for Sharded Add optimized lock methods for Sharded and refactor Lock Sep 3, 2023
@Zoxc
Contributor Author

Zoxc commented Sep 3, 2023

This now includes a refactored Lock implementation that removes RawLock, uses enums and works with track_caller.
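
As a rough picture of what "uses enums and works with track_caller" can look like, here is a minimal sketch. It is an assumption-laden illustration rather than the refactored rustc `Lock`: the `Mode` enum carrying its own state, the spin loop, the `sync` constructor flag, and the panic message are all choices made for this example. The lock state is an enum, so one `match` selects the whole code path, and `#[track_caller]` makes the "already held" panic point at the caller of `lock` rather than at the lock internals.

```rust
use std::cell::{Cell, UnsafeCell};
use std::hint::spin_loop;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

/// Lock state as an enum: the variant decides how locking is done.
enum Mode {
    NoSync(Cell<bool>),
    Sync(AtomicBool),
}

/// This sketch is deliberately not `Sync`. Sharing it across threads would
/// need an `unsafe impl` plus a guarantee that `NoSync` locks stay on one thread.
pub struct Lock<T> {
    mode: Mode,
    data: UnsafeCell<T>,
}

pub struct LockGuard<'a, T> {
    lock: &'a Lock<T>,
}

impl<T> Lock<T> {
    pub fn new(value: T, sync: bool) -> Self {
        let mode = if sync {
            Mode::Sync(AtomicBool::new(false))
        } else {
            Mode::NoSync(Cell::new(false))
        };
        Lock { mode, data: UnsafeCell::new(value) }
    }

    pub fn try_lock(&self) -> Option<LockGuard<'_, T>> {
        let acquired = match &self.mode {
            Mode::NoSync(held) => !held.replace(true),
            Mode::Sync(held) => held
                .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
                .is_ok(),
        };
        acquired.then(|| LockGuard { lock: self })
    }

    /// Panics (reporting the caller's location) if an unsynchronized lock is
    /// already held; spins until free in the synchronized mode.
    #[track_caller]
    pub fn lock(&self) -> LockGuard<'_, T> {
        match &self.mode {
            Mode::NoSync(held) => {
                if held.replace(true) {
                    panic!("lock was already held");
                }
            }
            Mode::Sync(held) => {
                while held
                    .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
                    .is_err()
                {
                    spin_loop();
                }
            }
        }
        LockGuard { lock: self }
    }
}

impl<T> Drop for LockGuard<'_, T> {
    fn drop(&mut self) {
        match &self.lock.mode {
            Mode::NoSync(held) => held.set(false),
            Mode::Sync(held) => held.store(false, Ordering::Release),
        }
    }
}

impl<T> Deref for LockGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: a guard exists only while the lock is held.
        unsafe { &*self.lock.data.get() }
    }
}

impl<T> DerefMut for LockGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // SAFETY: a guard exists only while the lock is held.
        unsafe { &mut *self.lock.data.get() }
    }
}

fn main() {
    let lock = Lock::new(vec![1, 2], false);
    lock.lock().push(3);
    let guard = lock.lock();
    assert!(lock.try_lock().is_none()); // already held by `guard`
    assert_eq!(*guard, vec![1, 2, 3]);
}
```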

@Zoxc
Contributor Author

Zoxc commented Sep 3, 2023

Up to date benchmark for cfg(parallel_compiler) and 1 thread:

Benchmark                Before (time)   After (time)   Change
🟣 clap:check            1.6611s         1.6510s        -0.61%
🟣 hyper:check           0.2533s         0.2516s        -0.65%
🟣 regex:check           0.9303s         0.9228s        -0.81%
🟣 syn:check             1.5010s         1.4892s        -0.78%
🟣 syntex_syntax:check   5.7691s         5.7315s        -0.65%
Total                    10.1147s        10.0461s       -0.68%
Summary                  1.0000s         0.9930s        -0.70%

@SparrowLii
Member

SparrowLii commented Sep 5, 2023

I think the new commit follows the discussion about splitting the impl of Lock into two modules.

This looks good to me. @nnethercote Can you have a look?


@SparrowLii
Member

SparrowLii commented Sep 8, 2023

Thanks! Let's run perf again to confirm.
@bors try @rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 8, 2023
@bors
Contributor

bors commented Sep 8, 2023

⌛ Trying commit 9690142 with merge 1f36988...

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 8, 2023
Add optimized lock methods for `Sharded` and refactor `Lock`


@bors
Contributor

bors commented Sep 8, 2023

☀️ Try build successful - checks-actions
Build commit: 1f36988 (1f36988828d2c6b2475df97d8de0e86ff7a4d9b5)


@rust-timer
Collaborator

Finished benchmarking commit (1f36988): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

                             mean     range             count
Regressions ❌ (primary)      0.7%     [0.7%, 0.7%]      1
Regressions ❌ (secondary)    -        -                 0
Improvements ✅ (primary)     -3.3%    [-3.3%, -3.3%]    1
Improvements ✅ (secondary)   -        -                 0
All ❌✅ (primary)            -1.3%    [-3.3%, 0.7%]     2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

                             mean     range             count
Regressions ❌ (primary)      -        -                 0
Regressions ❌ (secondary)    1.0%     [1.0%, 1.0%]      1
Improvements ✅ (primary)     -        -                 0
Improvements ✅ (secondary)   -        -                 0
All ❌✅ (primary)            -        -                 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 628.475s -> 628.149s (-0.05%)
Artifact size: 318.12 MiB -> 318.16 MiB (0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 8, 2023
@SparrowLii
Member

@bors r+

@bors
Contributor

bors commented Sep 11, 2023

📌 Commit 9690142 has been approved by SparrowLii

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 11, 2023
@bors
Contributor

bors commented Sep 11, 2023

⌛ Testing commit 9690142 with merge 9b72cc9...

@bors
Contributor

bors commented Sep 11, 2023

☀️ Test successful - checks-actions
Approved by: SparrowLii
Pushing 9b72cc9 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 11, 2023
@bors bors merged commit 9b72cc9 into rust-lang:master Sep 11, 2023
12 checks passed
@rustbot rustbot added this to the 1.74.0 milestone Sep 11, 2023
@Zoxc Zoxc deleted the sharded-lock branch September 11, 2023 04:06
@rust-timer
Collaborator

Finished benchmarking commit (9b72cc9): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

                             mean     range             count
Regressions ❌ (primary)      -        -                 0
Regressions ❌ (secondary)    2.7%     [2.7%, 2.7%]      1
Improvements ✅ (primary)     -        -                 0
Improvements ✅ (secondary)   -        -                 0
All ❌✅ (primary)            -        -                 0

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 631.455s -> 631.227s (-0.04%)
Artifact size: 317.62 MiB -> 317.64 MiB (0.01%)

Labels
A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
8 participants