Auto merge of #103172 - pcwalton:deduced-param-attrs, r=oli-obk
Introduce deduced parameter attributes, and use them for deducing `readonly` on indirect immutable freeze by-value function parameters.

Right now, `rustc` only examines function signatures and the platform ABI when
determining the LLVM attributes to apply to parameters. This results in missed
optimizations, because there are some attributes that can be determined via
analysis of the MIR making up the function body. In particular, `readonly`
could be applied to most indirectly-passed by-value function arguments
(specifically, those that are freeze and are observed not to be mutated), but
it currently is not.
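
To make the "freeze and observed not to be mutated" condition concrete, here is a minimal sketch in plain Rust. The types `Big` and `Counter` and the three functions are hypothetical and are not part of this patch; whether a given struct is actually passed indirectly depends on the target ABI.

```rust
use std::cell::Cell;

// Hypothetical types for illustration only; not part of the patch.
// 128 bytes: large enough that common ABIs would likely pass it indirectly.
struct Big {
    data: [u64; 16],
}

// Contains a `Cell` (and therefore an `UnsafeCell`), so the type is not freeze.
struct Counter {
    hits: Cell<u64>,
    data: [u64; 16],
}

// Freeze and never written in the body: a candidate for `readonly`.
fn sum(b: Big) -> u64 {
    b.data.iter().sum()
}

// Freeze, but the body writes to the parameter: not a candidate.
fn bump_and_sum(mut b: Big) -> u64 {
    b.data[0] += 1;
    b.data.iter().sum()
}

// Not freeze: the write goes through a shared reference, so looking for
// mutating uses alone would likely miss it; the type rules the parameter out.
fn touch(c: Counter) -> u64 {
    c.hits.set(c.hits.get() + 1);
    c.hits.get() + c.data[0]
}

fn main() {
    println!("{}", sum(Big { data: [1; 16] }));
    println!("{}", bump_and_sum(Big { data: [1; 16] }));
    println!("{}", touch(Counter { hits: Cell::new(0), data: [5; 16] }));
}
```

Only the parameter of `sum` satisfies both conditions, which is why the analysis needs both the body (for mutations) and the type (for freeze-ness).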

This patch introduces the machinery that allows `rustc` to determine those
attributes. It consists of a query, `deduced_param_attrs`, that, when
evaluated, analyzes the MIR of the function to determine supplementary
attributes. The results of this query for each function are written into the
crate metadata so that the deduced parameter attributes can be applied to
cross-crate functions. In this patch, we simply check the parameter for
mutations to determine whether the `readonly` attribute should be applied to
parameters that are indirect immutable freeze by-value.  More attributes could
conceivably be deduced in the future: `nocapture` and `noalias` come to mind.
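
The shape of that analysis can be sketched outside the compiler. The following is a toy model, not the code in this patch: the `Use` enum, the `deduce` and `attrs_for` functions, and the `(local, use)` pairs are hypothetical stand-ins for the MIR visitor, and the sketch omits the special handling of arguments moved directly into a call.

```rust
/// Hypothetical, simplified stand-in for how a MIR statement uses a local.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Use {
    Read,
    Move,
    Write,
}

/// Mirrors the shape of the struct added by the patch: one flag per parameter.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct DeducedParamAttrs {
    read_only: bool,
}

/// Local 0 is the return place; locals 1..=arg_count are the arguments.
fn deduce(arg_count: usize, is_freeze: &[bool], uses: &[(usize, Use)]) -> Vec<DeducedParamAttrs> {
    assert_eq!(is_freeze.len(), arg_count);
    let mut mutated = vec![false; arg_count];
    for &(local, kind) in uses {
        // Only arguments are interesting.
        if local == 0 || local > arg_count {
            continue;
        }
        // Reads and moves are treated alike; only writes count as mutation.
        if kind == Use::Write {
            mutated[local - 1] = true;
        }
    }
    let mut attrs: Vec<DeducedParamAttrs> = (0..arg_count)
        .map(|i| DeducedParamAttrs { read_only: !mutated[i] && is_freeze[i] })
        .collect();
    // Trailing default entries carry no information, so drop them.
    while attrs.last() == Some(&DeducedParamAttrs::default()) {
        attrs.pop();
    }
    attrs
}

/// Parameters past the end of the stored slice get the default attributes.
fn attrs_for(attrs: &[DeducedParamAttrs], index: usize) -> DeducedParamAttrs {
    attrs.get(index).copied().unwrap_or_default()
}

fn main() {
    let uses = [(1, Use::Read), (2, Use::Write), (3, Use::Move), (0, Use::Write)];
    let attrs = deduce(3, &[true, true, false], &uses);
    println!("{:?}", attrs);
    println!("{:?}", attrs_for(&attrs, 2));
}
```

Trimming trailing defaults keeps the stored array short, and the lookup side treats any missing entry as "no deduced attributes", so the two halves stay consistent.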

Adding `readonly` to indirect function parameters where applicable enables some
potential optimizations in LLVM that are discussed in [issue 103103] and [PR
103070] around avoiding stack-to-stack memory copies that appear in functions
like `core::fmt::Write::write_fmt` and `core::panicking::assert_failed`. These
functions pass a large structure unchanged by value to a subfunction that also
doesn't mutate it. Since the structure in this case is passed as an indirect
parameter, it's a pointer from LLVM's perspective. As a result, the
intermediate copy of the structure that our codegen emits could be optimized
away by LLVM's MemCpyOptimizer if it knew that the pointer is `readonly
nocapture noalias` in both the caller and callee. We already pass `nocapture
noalias`, but we're missing `readonly`, as we can't determine whether a
by-value parameter is mutated by examining the signature in Rust. I didn't have
much success with having LLVM infer the `readonly` attribute, even with fat
LTO; it seems that deducing it at the MIR level is necessary.
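
The forwarding pattern described above looks roughly like the following sketch. `Payload`, `inner`, and `outer` are hypothetical names chosen for illustration; they stand in for the large structures and the function pairs mentioned in this message.

```rust
// Hypothetical stand-in for a large by-value argument.
struct Payload {
    words: [u64; 32],
}

// The callee only reads its parameter.
#[inline(never)]
fn inner(p: Payload) -> u64 {
    p.words.iter().sum()
}

// The caller forwards its own by-value parameter unchanged. Codegen may
// emit a stack-to-stack copy of `p` before the call to `inner`.
#[inline(never)]
fn outer(p: Payload) -> u64 {
    inner(p)
}

fn main() {
    println!("{}", outer(Payload { words: [2; 32] }));
}
```

One way to see what happens on a given toolchain is to compile with `rustc -O --emit=llvm-ir` and inspect the parameter attributes and any `memcpy` calls in `outer`; the result varies by compiler version and target.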

No large benefits should be expected from this optimization *now*; LLVM needs
some changes (discussed in [PR 103070]) to more aggressively use the `noalias
nocapture readonly` combination in its alias analysis. I have some LLVM patches
for these optimizations and have had them looked over. With all the patches
applied locally, I enabled LLVM to remove all the `memcpy`s from the following
code:

```rust
fn main() {
    println!("Hello {}", 3);
}
```

which is a significant codegen improvement over the status quo. I expect that if this optimization kicks in in multiple places even for such a simple program, then it will apply to Rust code all over the place.

[issue 103103]: #103103

[PR 103070]: #103070
bors committed Oct 22, 2022
2 parents 8f2c56a + da630ac commit eecde58
Showing 14 changed files with 393 additions and 9 deletions.
1 change: 1 addition & 0 deletions compiler/rustc_metadata/src/rmeta/decoder/cstore_impl.rs
@@ -224,6 +224,7 @@ provide! { tcx, def_id, other, cdata,
fn_arg_names => { table }
generator_kind => { table }
trait_def => { table }
deduced_param_attrs => { table }
collect_trait_impl_trait_tys => {
Ok(cdata
.root
17 changes: 16 additions & 1 deletion compiler/rustc_metadata/src/rmeta/encoder.rs
@@ -31,7 +31,7 @@ use rustc_middle::ty::query::Providers;
use rustc_middle::ty::{self, SymbolName, Ty, TyCtxt};
use rustc_middle::util::common::to_readable_str;
use rustc_serialize::{opaque, Decodable, Decoder, Encodable, Encoder};
-use rustc_session::config::CrateType;
+use rustc_session::config::{CrateType, OptLevel};
use rustc_session::cstore::{ForeignModule, LinkagePreference, NativeLib};
use rustc_span::hygiene::{ExpnIndex, HygieneEncodeContext, MacroKind};
use rustc_span::symbol::{sym, Symbol};
@@ -1478,6 +1478,21 @@ impl<'a, 'tcx> EncodeContext<'a, 'tcx> {
record!(self.tables.unused_generic_params[def_id.to_def_id()] <- unused);
}
}

// Encode all the deduced parameter attributes for everything that has MIR, even for items
// that can't be inlined. Skip this unless we are optimizing and not in incremental mode,
// to save the query traffic.
if tcx.sess.opts.output_types.should_codegen()
&& tcx.sess.opts.optimize != OptLevel::No
&& tcx.sess.opts.incremental.is_none()
{
for &local_def_id in tcx.mir_keys(()) {
if let DefKind::AssocFn | DefKind::Fn = tcx.def_kind(local_def_id) {
record_array!(self.tables.deduced_param_attrs[local_def_id.to_def_id()] <-
self.tcx.deduced_param_attrs(local_def_id.to_def_id()));
}
}
}
}

fn encode_stability(&mut self, def_id: DefId) {
3 changes: 2 additions & 1 deletion compiler/rustc_metadata/src/rmeta/mod.rs
@@ -23,7 +23,7 @@ use rustc_middle::mir;
use rustc_middle::ty::fast_reject::SimplifiedType;
use rustc_middle::ty::query::Providers;
use rustc_middle::ty::{self, ReprOptions, Ty};
-use rustc_middle::ty::{GeneratorDiagnosticData, ParameterizedOverTcx, TyCtxt};
+use rustc_middle::ty::{DeducedParamAttrs, GeneratorDiagnosticData, ParameterizedOverTcx, TyCtxt};
use rustc_serialize::opaque::FileEncoder;
use rustc_session::config::SymbolManglingVersion;
use rustc_session::cstore::{CrateDepKind, ForeignModule, LinkagePreference, NativeLib};
@@ -402,6 +402,7 @@ define_tables! {
macro_definition: Table<DefIndex, LazyValue<ast::MacArgs>>,
proc_macro: Table<DefIndex, MacroKind>,
module_reexports: Table<DefIndex, LazyArray<ModChild>>,
deduced_param_attrs: Table<DefIndex, LazyArray<DeducedParamAttrs>>,

trait_impl_trait_tys: Table<DefIndex, LazyValue<FxHashMap<DefId, Ty<'static>>>>,
}
5 changes: 5 additions & 0 deletions compiler/rustc_middle/src/query/mod.rs
@@ -2127,4 +2127,9 @@ rustc_queries! {
) -> Result<(), ErrorGuaranteed> {
desc { |tcx| "checking assoc const `{}` has the same type as trait item", tcx.def_path_str(key.0.to_def_id()) }
}

query deduced_param_attrs(def_id: DefId) -> &'tcx [ty::DeducedParamAttrs] {
desc { |tcx| "deducing parameter attributes for {}", tcx.def_path_str(def_id) }
separate_provide_extern
}
}
1 change: 1 addition & 0 deletions compiler/rustc_middle/src/ty/codec.rs
@@ -455,6 +455,7 @@ impl_arena_copy_decoder! {<'tcx>
rustc_span::def_id::DefId,
rustc_span::def_id::LocalDefId,
(rustc_middle::middle::exported_symbols::ExportedSymbol<'tcx>, rustc_middle::middle::exported_symbols::SymbolExportInfo),
ty::DeducedParamAttrs,
}

#[macro_export]
15 changes: 15 additions & 0 deletions compiler/rustc_middle/src/ty/context.rs
@@ -2954,6 +2954,21 @@ impl<'tcx> TyCtxtAt<'tcx> {
}
}

/// Parameter attributes that can only be determined by examining the body of a function instead
/// of just its signature.
///
/// These can be useful for optimization purposes when a function is directly called. We compute
/// them and store them into the crate metadata so that downstream crates can make use of them.
///
/// Right now, we only have `read_only`, but `no_capture` and `no_alias` might be useful in the
/// future.
#[derive(Clone, Copy, PartialEq, Debug, Default, TyDecodable, TyEncodable, HashStable)]
pub struct DeducedParamAttrs {
/// The parameter is marked immutable in the function and contains no `UnsafeCell` (i.e. its
/// type is freeze).
pub read_only: bool,
}

// We are comparing types with different invariant lifetimes, so `ptr::eq`
// won't work for us.
fn ptr_eq<T, U>(t: *const T, u: *const U) -> bool {
2 changes: 1 addition & 1 deletion compiler/rustc_middle/src/ty/mod.rs
@@ -78,7 +78,7 @@ pub use self::consts::{
};
pub use self::context::{
tls, CanonicalUserType, CanonicalUserTypeAnnotation, CanonicalUserTypeAnnotations,
-CtxtInterners, DelaySpanBugEmitted, FreeRegionInfo, GeneratorDiagnosticData,
+CtxtInterners, DeducedParamAttrs, DelaySpanBugEmitted, FreeRegionInfo, GeneratorDiagnosticData,
GeneratorInteriorTypeCause, GlobalCtxt, Lift, OnDiskCache, TyCtxt, TypeckResults, UserType,
UserTypeAnnotationIndex,
};
1 change: 1 addition & 0 deletions compiler/rustc_middle/src/ty/parameterized.rs
@@ -61,6 +61,7 @@ trivially_parameterized_over_tcx! {
crate::middle::resolve_lifetime::ObjectLifetimeDefault,
crate::mir::ConstQualifs,
ty::AssocItemContainer,
ty::DeducedParamAttrs,
ty::Generics,
ty::ImplPolarity,
ty::ReprOptions,
249 changes: 249 additions & 0 deletions compiler/rustc_mir_transform/src/deduce_param_attrs.rs
@@ -0,0 +1,249 @@
//! Deduces supplementary parameter attributes from MIR.
//!
//! Deduced parameter attributes are those that can only be soundly determined by examining the
//! body of the function instead of just the signature. These can be useful for optimization
//! purposes on a best-effort basis. We compute them here and store them into the crate metadata so
//! dependent crates can use them.

use rustc_hir::def_id::DefId;
use rustc_index::bit_set::BitSet;
use rustc_middle::mir::visit::{NonMutatingUseContext, PlaceContext, Visitor};
use rustc_middle::mir::{Body, Local, Location, Operand, Terminator, TerminatorKind, RETURN_PLACE};
use rustc_middle::ty::{self, DeducedParamAttrs, ParamEnv, Ty, TyCtxt};
use rustc_session::config::OptLevel;
use rustc_span::DUMMY_SP;

/// A visitor that determines which arguments have been mutated. We can't use the mutability field
/// on LocalDecl for this because it has no meaning post-optimization.
struct DeduceReadOnly {
/// Each bit is indexed by argument number, starting at zero (so 0 corresponds to local decl
/// 1). The bit is true if the argument may have been mutated or false if we know it hasn't
/// been up to the point we're at.
mutable_args: BitSet<usize>,
}

impl DeduceReadOnly {
/// Returns a new DeduceReadOnly instance.
fn new(arg_count: usize) -> Self {
Self { mutable_args: BitSet::new_empty(arg_count) }
}
}

impl<'tcx> Visitor<'tcx> for DeduceReadOnly {
fn visit_local(&mut self, local: Local, mut context: PlaceContext, _: Location) {
// We're only interested in arguments.
if local == RETURN_PLACE || local.index() > self.mutable_args.domain_size() {
return;
}

// Replace place contexts that are moves with copies. This is safe in all cases except
// function argument position, which we already handled in `visit_terminator()` by using the
// ArgumentChecker. See the comment in that method for more details.
//
// In the future, we might want to move this out into a separate pass, but for now let's
// just do it on the fly because that's faster.
if matches!(context, PlaceContext::NonMutatingUse(NonMutatingUseContext::Move)) {
context = PlaceContext::NonMutatingUse(NonMutatingUseContext::Copy);
}

match context {
PlaceContext::MutatingUse(..)
| PlaceContext::NonMutatingUse(NonMutatingUseContext::Move) => {
// This is a mutation, so mark it as such.
self.mutable_args.insert(local.index() - 1);
}
PlaceContext::NonMutatingUse(..) | PlaceContext::NonUse(..) => {
// Not mutating, so it's fine.
}
}
}

fn visit_terminator(&mut self, terminator: &Terminator<'tcx>, location: Location) {
// OK, this is subtle. Suppose that we're trying to deduce whether `x` in `f` is read-only
// and we have the following:
//
// fn f(x: BigStruct) { g(x) }
// fn g(mut y: BigStruct) { y.foo = 1 }
//
// If, at the generated MIR level, `f` turned into something like:
//
// fn f(_1: BigStruct) -> () {
// let mut _0: ();
// bb0: {
// _0 = g(move _1) -> bb1;
// }
// ...
// }
//
// then it would be incorrect to mark `x` (i.e. `_1`) as `readonly`, because `g`'s write to
// its copy of the indirect parameter would actually be a write directly to the pointer that
// `f` passes. Note that function arguments are the only situation in which this problem can
// arise: every other use of `move` in MIR doesn't actually write to the value it moves
// from.
//
// Anyway, right now this situation doesn't actually arise in practice. Instead, the MIR for
// that function looks like this:
//
// fn f(_1: BigStruct) -> () {
// let mut _0: ();
// let mut _2: BigStruct;
// bb0: {
// _2 = move _1;
// _0 = g(move _2) -> bb1;
// }
// ...
// }
//
// Because of that extra move that MIR construction inserts, `x` (i.e. `_1`) can *in
// practice* safely be marked `readonly`.
//
// To handle the possibility that other optimizations (for example, destination propagation)
// might someday generate MIR like the first example above, we panic upon seeing an argument
// to *our* function that is directly moved into *another* function as an argument. Having
// eliminated that problematic case, we can safely treat moves as copies in this analysis.
//
// In the future, if MIR optimizations cause arguments of a caller to be directly moved into
// the argument of a callee, we can just add that argument to `mutable_args` instead of
// panicking.
//
// Note that, because the problematic MIR is never actually generated, we can't add a test
// case for this.

if let TerminatorKind::Call { ref args, .. } = terminator.kind {
for arg in args {
if let Operand::Move(_) = *arg {
// ArgumentChecker panics if a direct move of an argument from a caller to a
// callee was detected.
//
// If, in the future, MIR optimizations cause arguments to be moved directly
// from callers to callees, change the panic to instead add the argument in
// question to `mutable_args`.
ArgumentChecker::new(self.mutable_args.domain_size())
.visit_operand(arg, location)
}
}
};

self.super_terminator(terminator, location);
}
}

/// A visitor that simply panics if a direct move of an argument from a caller to a callee was
/// detected.
struct ArgumentChecker {
/// The number of arguments to the calling function.
arg_count: usize,
}

impl ArgumentChecker {
/// Creates a new ArgumentChecker.
fn new(arg_count: usize) -> Self {
Self { arg_count }
}
}

impl<'tcx> Visitor<'tcx> for ArgumentChecker {
fn visit_local(&mut self, local: Local, context: PlaceContext, _: Location) {
// Check to make sure that, if this local is an argument, we didn't move directly from it.
if matches!(context, PlaceContext::NonMutatingUse(NonMutatingUseContext::Move))
&& local != RETURN_PLACE
&& local.index() <= self.arg_count
{
// If, in the future, MIR optimizations cause arguments to be moved directly from
// callers to callees, change this panic to instead add the argument in question to
// `mutable_args` in `DeduceReadOnly`.
panic!("Detected a direct move from a caller's argument to a callee's argument!")
}
}
}

/// Returns true if values of a given type will never be passed indirectly, regardless of ABI.
fn type_will_always_be_passed_directly<'tcx>(ty: Ty<'tcx>) -> bool {
matches!(
ty.kind(),
ty::Bool
| ty::Char
| ty::Float(..)
| ty::Int(..)
| ty::RawPtr(..)
| ty::Ref(..)
| ty::Slice(..)
| ty::Uint(..)
)
}

/// Returns the deduced parameter attributes for a function.
///
/// Deduced parameter attributes are those that can only be soundly determined by examining the
/// body of the function instead of just the signature. These can be useful for optimization
/// purposes on a best-effort basis. We compute them here and store them into the crate metadata so
/// dependent crates can use them.
pub fn deduced_param_attrs<'tcx>(tcx: TyCtxt<'tcx>, def_id: DefId) -> &'tcx [DeducedParamAttrs] {
// This computation is unfortunately rather expensive, so don't do it unless we're optimizing.
// Also skip it in incremental mode.
if tcx.sess.opts.optimize == OptLevel::No || tcx.sess.opts.incremental.is_some() {
return &[];
}

// If the Freeze language item isn't present, then don't bother.
if tcx.lang_items().freeze_trait().is_none() {
return &[];
}

// Codegen won't use this information for anything if all the function parameters are passed
// directly. Detect that and bail, for compilation speed.
let fn_ty = tcx.type_of(def_id);
if matches!(fn_ty.kind(), ty::FnDef(..)) {
if fn_ty
.fn_sig(tcx)
.inputs()
.skip_binder()
.iter()
.cloned()
.all(type_will_always_be_passed_directly)
{
return &[];
}
}

// Don't deduce any attributes for functions that have no MIR.
if !tcx.is_mir_available(def_id) {
return &[];
}

// Deduced attributes for other crates should be read from the metadata instead of via this
// function.
debug_assert!(def_id.is_local());

// Grab the optimized MIR. Analyze it to determine which arguments have been mutated.
let body: &Body<'tcx> = tcx.optimized_mir(def_id);
let mut deduce_read_only = DeduceReadOnly::new(body.arg_count);
deduce_read_only.visit_body(body);

// Set the `readonly` attribute for every argument that we concluded is immutable and that
// contains no UnsafeCells.
//
// FIXME: This is overly conservative around generic parameters: `is_freeze()` will always
// return false for them. For a description of alternatives that could do a better job here,
// see [1].
//
// [1]: https://github.com/rust-lang/rust/pull/103172#discussion_r999139997
let mut deduced_param_attrs = tcx.arena.alloc_from_iter(
body.local_decls.iter().skip(1).take(body.arg_count).enumerate().map(
|(arg_index, local_decl)| DeducedParamAttrs {
read_only: !deduce_read_only.mutable_args.contains(arg_index)
&& local_decl.ty.is_freeze(tcx.at(DUMMY_SP), ParamEnv::reveal_all()),
},
),
);

// Trailing parameters past the size of the `deduced_param_attrs` array are assumed to have the
// default set of attributes, so we don't have to store them explicitly. Pop them off to save a
// few bytes in metadata.
while deduced_param_attrs.last() == Some(&DeducedParamAttrs::default()) {
let last_index = deduced_param_attrs.len() - 1;
deduced_param_attrs = &mut deduced_param_attrs[0..last_index];
}

deduced_param_attrs
}
2 changes: 2 additions & 0 deletions compiler/rustc_mir_transform/src/lib.rs
@@ -56,6 +56,7 @@ mod const_prop_lint;
mod coverage;
mod dead_store_elimination;
mod deaggregator;
mod deduce_param_attrs;
mod deduplicate_blocks;
mod deref_separator;
mod dest_prop;
@@ -139,6 +140,7 @@ pub fn provide(providers: &mut Providers) {
promoted_mir_of_const_arg: |tcx, (did, param_did)| {
promoted_mir(tcx, ty::WithOptConstParam { did, const_param_did: Some(param_did) })
},
deduced_param_attrs: deduce_param_attrs::deduced_param_attrs,
..*providers
};
}
1 change: 1 addition & 0 deletions compiler/rustc_query_impl/src/on_disk_cache.rs
@@ -848,6 +848,7 @@ impl_ref_decoder! {<'tcx>
rustc_span::def_id::DefId,
rustc_span::def_id::LocalDefId,
(rustc_middle::middle::exported_symbols::ExportedSymbol<'tcx>, rustc_middle::middle::exported_symbols::SymbolExportInfo),
ty::DeducedParamAttrs,
}

//- ENCODING -------------------------------------------------------------------