Introduce deduced parameter attributes, and use them for deducing `readonly` on indirect immutable freeze by-value function parameters. #103172

pcwalton · 2022-10-18T03:00:10Z

Introduce deduced parameter attributes, and use them for deducing readonly on
indirect immutable freeze by-value function parameters.

Right now, rustc only examines function signatures and the platform ABI when
determining the LLVM attributes to apply to parameters. This results in missed
optimizations, because there are some attributes that can be determined via
analysis of the MIR making up the function body. In particular, readonly
could be applied to most indirectly-passed by-value function arguments
(specifically, those that are freeze and are observed not to be mutated), but
it currently is not.

This patch introduces the machinery that allows rustc to determine those
attributes. It consists of a query, deduced_param_attrs, that, when
evaluated, analyzes the MIR of the function to determine supplementary
attributes. The results of this query for each function are written into the
crate metadata so that the deduced parameter attributes can be applied to
cross-crate functions. In this patch, we simply check the parameter for
mutations to determine whether the readonly attribute should be applied to
parameters that are indirect immutable freeze by-value. More attributes could
conceivably be deduced in the future: nocapture and noalias come to mind.
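
As a rough illustration of the rule described above, the three conditions (passed indirectly, freeze, and not observed to be mutated) can be combined as in the sketch below. The `ParamSummary` type and the `may_mark_readonly` function are hypothetical names invented for this example; they are not rustc's data structures or the actual `deduced_param_attrs` query.

```rust
/// Hypothetical summary of one by-value parameter. These fields are
/// illustrative only and do not mirror rustc's internal types.
#[derive(Debug, Clone, Copy)]
pub struct ParamSummary {
    /// By value in the source, but lowered to a pointer by the ABI.
    pub passed_indirectly: bool,
    /// The type has no interior mutability (no `UnsafeCell` inside).
    pub is_freeze: bool,
    /// The body analysis saw a use that counts as a mutation.
    pub mutated_in_body: bool,
}

/// Returns true when `readonly` could be added under the three
/// conditions named in the description.
pub fn may_mark_readonly(p: ParamSummary) -> bool {
    p.passed_indirectly && p.is_freeze && !p.mutated_in_body
}
```

In this sketch a single failing condition is enough to withhold the attribute, which matches the conservative framing of the description.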

Adding readonly to indirect function parameters where applicable enables some
potential optimizations in LLVM that are discussed in issue 103103 and PR
103070 around avoiding stack-to-stack memory copies that appear in functions
like core::fmt::Write::write_fmt and core::panicking::assert_failed. These
functions pass a large structure unchanged by value to a subfunction that also
doesn't mutate it. Since the structure in this case is passed as an indirect
parameter, it's a pointer from LLVM's perspective. As a result, the
intermediate copy of the structure that our codegen emits could be optimized
away by LLVM's MemCpyOptimizer if it knew that the pointer is
`readonly nocapture noalias` in both the caller and callee. We already pass
`nocapture noalias`, but we're missing `readonly`, as we can't determine whether a
by-value parameter is mutated by examining the signature in Rust. I didn't have
much success with having LLVM infer the readonly attribute, even with fat
LTO; it seems that deducing it at the MIR level is necessary.
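
The shape of code this paragraph describes might look like the following sketch. `Big`, `sum`, and `forward` are hypothetical names chosen for illustration, and whether the struct is actually passed indirectly depends on the target ABI; the example only shows the source-level pattern, not any guaranteed codegen outcome.

```rust
/// Hypothetical large value type with no interior mutability.
#[derive(Clone, Copy)]
pub struct Big {
    pub words: [u64; 32],
}

/// Callee that only reads its by-value argument.
#[inline(never)]
pub fn sum(big: Big) -> u64 {
    big.words.iter().sum()
}

/// Forwards its argument unchanged to a callee that also does not
/// mutate it, mirroring the pattern described above.
#[inline(never)]
pub fn forward(big: Big) -> u64 {
    sum(big)
}
```

Neither function writes to `big`, so this is the kind of parameter the description says could carry `readonly` once the body has been analyzed.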

No large benefits should be expected from this optimization now; LLVM needs
some changes (discussed in PR 103070) to more aggressively use the
`noalias nocapture readonly` combination in its alias analysis. I have some LLVM patches
for these optimizations and have had them looked over. With all the patches
applied locally, I enabled LLVM to remove all the memcpys from the following
code:

fn main() {
    println!("Hello {}", 3);
}

which is a significant codegen improvement over the status quo. I expect that if this optimization kicks in in multiple places even for such a simple program, then it will apply to Rust code all over the place.

rustbot · 2022-10-18T03:00:14Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

rust-highfive · 2022-10-18T03:00:14Z

r? @compiler-errors

(rust-highfive has picked a reviewer for you, use r? to override)

pcwalton · 2022-10-18T04:41:16Z

The patch is updated to use the Visitor to detect mutations of parameters. I'll mark it as non-draft if there are no more comments once the tests pass locally.

pcwalton · 2022-10-18T05:42:51Z

This seems ready.

Those two failures confuse me—I don't mutate the MIR at all, and these aren't codegen tests…

oli-obk · 2022-10-18T07:33:18Z

@bors try @rust-timer queue

rust-timer · 2022-10-18T07:33:19Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-10-18T07:33:27Z

⌛ Trying commit bf18d564e3d54e08ba4372e1d4b20ef1b6e3afaa with merge 9077e397fbfc2a1a5945228575b6ce77985fd1f8...

pcwalton · 2022-10-18T09:02:04Z

Updated the PR to address comments. I added a new test to ensure that we don't mark non-freeze types as readonly. I also added a comment explaining why I don't think that the fact that moves semantically store undef to the moved-from value invalidates the optimization.

bors · 2022-10-21T06:06:19Z

☀️ Try build successful - checks-actions
Build commit: a5cf94e7f6c6d3272682f3eeeb831ec529decd2f

rust-timer · 2022-10-21T06:06:21Z

Queued a5cf94e7f6c6d3272682f3eeeb831ec529decd2f with parent dcb3761, future comparison URL.

rust-timer · 2022-10-21T07:24:10Z

Finished benchmarking commit (a5cf94e7f6c6d3272682f3eeeb831ec529decd2f): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean¹ | range | count² |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.3%, 0.3%] | 1 |
| Regressions ❌ (secondary) | 0.5% | [0.1%, 1.4%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.2% | [-0.2%, -0.2%] | 3 |
| All ❌✅ (primary) | 0.3% | [0.3%, 0.3%] | 1 |

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

This benchmark run did not return any relevant results for this metric.

¹ the arithmetic mean of the percent change
² number of relevant changes

oli-obk · 2022-10-21T07:39:08Z

@bors delegate+

code and perf lgtm now. r=me with commits squashed

bors · 2022-10-21T07:39:10Z

✌️ @pcwalton can now approve this pull request

pcwalton · 2022-10-21T09:35:36Z

@bors: r=oli-obk

@rustbot label: +perf-regression-triaged These are minor inconsistent performance changes in service of better optimization once LLVM alias analysis improves.

rustbot · 2022-10-21T09:35:38Z

Error: Label These can only be set by Rust team members

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

bors · 2022-10-21T09:35:38Z

📌 Commit da630ac has been approved by oli-obk

It is now in the queue for this repository.

bors · 2022-10-22T02:28:08Z

⌛ Testing commit da630ac with merge eecde58...

Aaron1011 · 2022-10-22T05:06:49Z

cc @rust-lang/wg-unsafe-code-guidelines

From my understanding, this optimization won't change the behavior of any sound programs. If we create a pointer/reference to a function argument (e.g. fn foo(mut arg: SomeType) { let ptr = &mut arg }), then the argument will be considered 'mutable', and this optimization will be skipped. The only way for a program to modify such an argument would be to perform an out-of-bounds write on a different mutable pointer (which happens to have the same address as the argument storage). This is already undefined behavior.

However, I haven't seen any mention of unsafe in the PR discussion, so I thought it would be good to get confirmation of my reasoning from more knowledgeable people.

bors · 2022-10-22T05:08:45Z

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing eecde58 to master...

rust-timer · 2022-10-22T06:25:37Z

Finished benchmarking commit (eecde58): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean¹ | range | count² |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.3% | [1.3%, 1.3%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.2% | [-0.2%, -0.2%] | 4 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean¹ | range | count² |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 4.2% | [4.2%, 4.2%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean¹ | range | count² |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -4.6% | [-5.0%, -4.3%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

¹ the arithmetic mean of the percent change
² number of relevant changes

RalfJung · 2022-10-22T10:20:54Z

@Aaron1011 is there a summary of what happens here for someone who doesn't live and breathe LLVM IR?^^ My question is basically the same as in #103103: What requirements do we need to impose on the MIR level to justify this attribute?

"indirect immutable freeze by-value function parameter" is using a lot of terms that don't exist in MIR so I don't understand what this means. How can a parameter be both indirect and by-value?!?

RalfJung · 2022-10-22T10:23:17Z

compiler/rustc_mir_transform/src/deduce_param_attrs.rs

+ PlaceContext::MutatingUse(..)
+ | PlaceContext::NonMutatingUse(NonMutatingUseContext::Move) => {
+ // This is a mutation, so mark it as such.
+ self.mutable_args.insert(local.index() - 1);


NonMutatingUseContext::Move is a mutation? Looks like naming went wrong somewhere...?

It's not a mutation for borrowck purposes, since you don't need let mut to move out. The opsem might disagree with that, but we should decide the opsem first and then consider renaming something here
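
To make the classification under discussion concrete, here is a toy model of the two match arms quoted above. `UseKind` and `mutable_args` are hypothetical stand-ins written for this example, not rustc's `PlaceContext` or the visitor in `deduce_param_attrs.rs`; the only things taken from the snippet are that mutating uses and moves both mark the argument, and that the argument index is the local index minus one (local 0 being the return place in MIR).

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for the kinds of place use a MIR visitor reports.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UseKind {
    /// Any mutating use, such as a store or a mutable borrow.
    Mutating,
    /// A move out of the local; non-mutating as far as borrowck goes.
    Move,
    /// Any other non-mutating use, such as a copy.
    OtherNonMutating,
}

/// Collects zero-based indices of arguments to treat as mutable.
/// `uses` holds `(local, kind)` pairs. Local 0 is the return place and
/// locals `1..=arg_count` are the arguments.
pub fn mutable_args(arg_count: usize, uses: &[(usize, UseKind)]) -> BTreeSet<usize> {
    let mut marked = BTreeSet::new();
    for &(local, kind) in uses {
        // Ignore the return place and locals that are not arguments.
        if local == 0 || local > arg_count {
            continue;
        }
        match kind {
            // Both arms from the quoted snippet mark the argument.
            UseKind::Mutating | UseKind::Move => {
                marked.insert(local - 1);
            }
            // Left unmarked in this sketch.
            UseKind::OtherNonMutating => {}
        }
    }
    marked
}
```

Treating `Move` like a mutation is the conservative choice the review comment questions; the sketch takes no position on whether the operational semantics should agree.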

RalfJung · 2022-10-22T10:24:11Z

If we create a pointer/reference to a function argument (e.g. fn foo(mut arg: SomeType) { let ptr = &mut arg }), then the argument will be considered 'mutable', and this optimization will be skipped.

What if `&`, `addr_of`, or `addr_of_mut` are being used?

JakobDegen · 2022-10-23T00:06:14Z

@RalfJung as far as I can tell, this only affects byval parameters in LLVM, which are used in some cases but only ever for parameters that are passed by value in MIR. As such, I don't think this imposes any additional restrictions on MIR.

rust-highfive assigned compiler-errors Oct 18, 2022

rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Oct 18, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 18, 2022

pcwalton force-pushed the deduced-param-attrs branch 5 times, most recently from 9cdbc31 to 2d670be Compare October 18, 2022 03:20

JakobDegen reviewed Oct 18, 2022

compiler/rustc_mir_transform/src/lib.rs (outdated)

pcwalton marked this pull request as draft October 18, 2022 03:28

pcwalton force-pushed the deduced-param-attrs branch from 2d670be to 9603c77 Compare October 18, 2022 04:40

pcwalton force-pushed the deduced-param-attrs branch from 9603c77 to 0e8a4e6 Compare October 18, 2022 04:50

pcwalton marked this pull request as ready for review October 18, 2022 05:42

pcwalton force-pushed the deduced-param-attrs branch from 0e8a4e6 to bf18d56 Compare October 18, 2022 06:06

tmiasko reviewed Oct 18, 2022

compiler/rustc_mir_transform/src/deduce_param_attrs.rs (outdated)

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 18, 2022

oli-obk requested changes Oct 18, 2022

pcwalton force-pushed the deduced-param-attrs branch from bf18d56 to 11fc0a7 Compare October 18, 2022 09:00

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 21, 2022

pcwalton force-pushed the deduced-param-attrs branch from e4e37f0 to da630ac Compare October 21, 2022 09:34

bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 22, 2022

bors merged commit eecde58 into rust-lang:master Oct 22, 2022

rustbot added this to the 1.66.0 milestone Oct 22, 2022

This was referenced Oct 22, 2022

Separate generator info from MIR body. #101547

Closed

Compute generator saved locals on MIR #101692

Merged

rustbot removed the perf-regression Performance regression. label Oct 22, 2022

RalfJung reviewed Oct 22, 2022

Noratrieb mentioned this pull request Nov 22, 2022

Out of storage use of local for temporary caused by label break #104736

Closed
