Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine scopes around temporaries generated in local accesses #92508

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Cargo.lock
Original file line number Diff line number Diff line change
Expand Up @@ -4134,6 +4134,7 @@ dependencies = [
"rustc_session",
"rustc_span",
"rustc_target",
"smallvec",
"tracing",
]

Expand Down
46 changes: 46 additions & 0 deletions compiler/rustc_middle/src/middle/region.rs
Original file line number Diff line number Diff line change
Expand Up @@ -234,6 +234,15 @@ pub struct ScopeTree {
/// escape into 'static and should have no local cleanup scope.
rvalue_scopes: FxHashMap<hir::ItemLocalId, Option<Scope>>,

/// Sometimes destruction scopes are too overarching in the sense that
/// all temporaries are dropped while it may be the case that
/// only a subset of temporaries is requested to be dropped.
/// For instance, `x[y.z()]` indexing operation can discard the temporaries
/// generated in evaluating `y.z()`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, x[y.z()] indexing operation can discard the temporaries generated in evaluating y.z().

I'll admit, I still need to reload my mental cache here, but: Is this actually true? Aren't there cases where the value produced by y.z() may hold references to those temporaries, so you'd need to keep them alive across the evaluation of the indexing operation?

I'll do some investigation to double-check my understanding here, and see if I can convince myself that you are correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(While talking to niko about this, he notes the example of mutex.lock().something() as a potentially problematic example...)

Copy link
Contributor Author

@dingxiangfei2009 dingxiangfei2009 Feb 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was initially bending my mind as well. Let me try to elaborate. So the indexing operation, either through Index or IndexMut, produces borrows with lifetimes that must outlives self.
Let us take Index as an example. In this example, we attempt to allow indexing into A with a temporary reference &(), producing a value of type &B<'_>. The result of this indexing will need the temporary index &() to be kept alive

use std::ops::Index;

struct A;

struct B<'a>(&'a ());

impl<'a> B<'a> {
    fn foo(&self) {}
}

impl<'a> Index<&'a ()> for A {
    type Output = B<'a>;
    fn index<'s>(&'s self, idx: &'a ()) -> &'s Self::Output {
        unimplemented!() // (1)
    }
}

fn main() {
    let a = A;
    a[&()].foo();
}

At (1), we are expected to construct a borrow with a lifetime 's. However, the Index trait definition does allow us to impose Self::Output: 's or any outliving relation between 'a and 's in this scope. Overall, this means that if the index, which is '_ () in this example, has a lifetime 'a, 's must outlives 'a.

A side track I think there should be an implicit bound of Self::Output: 's to be satisfied, but the current compiler does not complain as long as the return value is !.
So the remaining question is in which way we shall honor this contract. Indeed, this may have been upheld so far through enlarging the terminating scope.

Regarding the example from @nikomatsakis, let us example the following elaborated code.

use std::ops::{Index, IndexMut, DerefMut};
use std::cell::Cell;
use std::sync::Mutex;

struct A<'a>(Cell<&'a mut ()>);

impl<'a> Index<&'a mut ()> for A<'a> {
    type Output = ();
    fn index(&self, idx: &'a mut ()) -> &Self::Output {
        &*self.0.replace(idx)
    }
}

impl<'a> IndexMut<&'a mut ()> for A<'a> {
    fn index_mut(&mut self, idx: &'a mut ()) -> &mut Self::Output {
        self.0.replace(idx)
    }
}

struct B(());

impl B {
    fn new() -> B {
        B(())
    }
    fn foo(&mut self) -> &mut () {
        &mut self.0
    }
}

fn main() {
    let t = &mut ();
    let mut a = A(Cell::new(t));
    let i = Mutex::new(());
    a[i.lock().unwrap().deref_mut()] = (); // <- temporary value is freed at the end of this statement
    a[B::new().foo()] = (); // <- same reason
    a[&mut ()] = ();
}

The expression a[i.lock().unwrap().deref_mut()] = () is currently rejected because the temporary mutex guard mutex.lock() is preserved only until the end of this assignment.

In any case, I am also experimenting ways to produce examples that requiring living temporaries after index to be extra sure.

Copy link
Member

@pnkfelix pnkfelix Feb 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

It looks like you are making an argument based on the lifetimes embedded in the Idx type of trait Index<Idx>.

I think I can see the reasoning there; its something like "the definition of the Index trait: pub trait Index<Idx: ?Sized> { type Output: ?Sized; fn index(&self, index: Idx) -> &Self::Output; }, will force any free lifetimes associated with the Idx to be bound at the scope of the impl<'a, 'b, ...> Index<...> for Concrete<...> for any concrete type Concrete<...>."

My follow-up question then is: Is there a case where the Idx type doesn't have any lifetimes in it, but the scope of temporaries of the expression computing Idx are nonetheless significant (and where the change suggested here could break code)?

My current suspicion is that the most natural cases where that could arise is that it could only happen in unsafe code that is incorrect (i.e., unsafe code that should be using lifetimes to express the constraints between the temporaries and the dynamic extent of the produced value, but is sidestepping doing so, and in the process opens up potential issues where user code can violate those now implicit constraints).

  • I think we have precedent for making changes like that, but I also think we need to be careful whenever we do so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(In case its not clear, I'm going to poke around a bit more and see if I can make my concerns here concrete. My current thinking is that you may indeed be safe in the assertions you are making here, but I want to make sure I understand the edge cases really well.)

/// `eager_scopes` is a map from items to a region smaller than a destruction scope
/// so that the temporaries do not need to linger unnecessarily.
eager_scopes: FxHashMap<hir::ItemLocalId, Scope>,

/// If there are any `yield` nested within a scope, this map
/// stores the `Span` of the last one and its index in the
/// postorder of the Visitor traversal on the HIR.
Expand Down Expand Up @@ -358,6 +367,38 @@ impl ScopeTree {
self.rvalue_scopes.insert(var, lifetime);
}

pub fn record_local_access_scope(&mut self, var: hir::ItemLocalId, proposed_lifetime: Scope) {
debug!(
"record_local_access_scope(var={:?}, proposed_lifetime={:?})",
var, proposed_lifetime
);
let mut id = Scope { id: var, data: ScopeData::Node };

while let Some(&(p, _)) = self.parent_map.get(&id) {
match p.data {
ScopeData::Destruction => return,
_ => id = p,
}
if id == proposed_lifetime {
// proposed lifetime outlives the destruction lifetime
self.record_rvalue_scope(var, Some(proposed_lifetime));
return;
}
}
}

pub fn record_eager_scope(&mut self, var: hir::ItemLocalId, proposed_lifetime: Scope) {
debug!("record_eager_scope(var={:?}, proposed_lifetime={:?})", var, proposed_lifetime);
if let Some(destruction) = self.temporary_scope(var) {
if self.is_subscope_of(destruction, proposed_lifetime) {
// If the current temporary scope is already a subset of the proposed region,
// it is safe to keep this scope and discard the proposal.
return;
}
}
self.eager_scopes.insert(var, proposed_lifetime);
}

/// Returns the narrowest scope that encloses `id`, if any.
pub fn opt_encl_scope(&self, id: Scope) -> Option<Scope> {
self.parent_map.get(&id).cloned().map(|(p, _)| p)
Expand Down Expand Up @@ -393,6 +434,9 @@ impl ScopeTree {
}
_ => id = p,
}
if let Some(&eager_scope) = self.eager_scopes.get(&id.item_local_id()) {
return Some(eager_scope);
}
}

debug!("temporary_scope({:?}) = None", expr_id);
Expand Down Expand Up @@ -444,6 +488,7 @@ impl<'a> HashStable<StableHashingContext<'a>> for ScopeTree {
ref var_map,
ref destruction_scopes,
ref rvalue_scopes,
ref eager_scopes,
ref yield_in_scope,
} = *self;

Expand All @@ -456,6 +501,7 @@ impl<'a> HashStable<StableHashingContext<'a>> for ScopeTree {
var_map.hash_stable(hcx, hasher);
destruction_scopes.hash_stable(hcx, hasher);
rvalue_scopes.hash_stable(hcx, hasher);
eager_scopes.hash_stable(hcx, hasher);
yield_in_scope.hash_stable(hcx, hasher);
}
}
1 change: 1 addition & 0 deletions compiler/rustc_passes/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ edition = "2021"

[dependencies]
tracing = "0.1"
smallvec = { version = "1.6.1", features = ["union", "may_dangle"] }
rustc_middle = { path = "../rustc_middle" }
rustc_attr = { path = "../rustc_attr" }
rustc_data_structures = { path = "../rustc_data_structures" }
Expand Down
155 changes: 149 additions & 6 deletions compiler/rustc_passes/src/region.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ use rustc_middle::ty::query::Providers;
use rustc_middle::ty::TyCtxt;
use rustc_span::source_map;
use rustc_span::Span;
use smallvec::SmallVec;

use std::mem;

Expand Down Expand Up @@ -203,6 +204,105 @@ fn resolve_stmt<'tcx>(visitor: &mut RegionResolutionVisitor<'tcx>, stmt: &'tcx h
visitor.cx.parent = prev_parent;
}

fn mark_local_terminating_scopes<'tcx>(expr: &'tcx hir::Expr<'tcx>) -> FxHashSet<hir::ItemLocalId> {
struct LocalAccessResolutionVisitor<'a> {
locals: &'a mut FxHashSet<hir::ItemLocalId>,
}
impl<'a> LocalAccessResolutionVisitor<'a> {
fn probe<'b>(&mut self, expr: &'b Expr<'b>) {
if self.locals.contains(&expr.hir_id.local_id) {
return;
}
let mut nested_expr = expr;
let mut ops = SmallVec::<[_; 4]>::new();
enum OpTy {
Ref,
Deref,
Project,
}
// Here we are looking for a chain of access to extract a place reference for
// a local variable.
// For instance, given `&*(&*x[y]).z().u`, `probe` will pick out
// the sub-expressions `&*x[y]` and `y` as expressions making accesses to locals.
// The place references generated can be discarded earlier as MIR allows,
// and, thus, can be assigned a smaller region.
loop {
match nested_expr.kind {
hir::ExprKind::Path(hir::QPath::Resolved(
_,
hir::Path { res: hir::def::Res::Local(_), .. },
)) => {
// We have reached the end of the access path
// because a local variable access is encountered
break;
}
hir::ExprKind::AddrOf(_, _, subexpr) => {
ops.push((nested_expr, OpTy::Ref));
nested_expr = subexpr;
}
hir::ExprKind::Unary(hir::UnOp::Deref, subexpr) => {
ops.push((nested_expr, OpTy::Deref));
nested_expr = subexpr;
}
hir::ExprKind::Field(subexpr, _) | hir::ExprKind::Index(subexpr, _) => {
ops.push((nested_expr, OpTy::Project));
nested_expr = subexpr;
}
_ => {
drop(ops);
intravisit::walk_expr(self, expr);
return;
}
}
}
let mut ref_level = SmallVec::<[_; 4]>::new();
for (expr, op) in ops.into_iter().rev() {
match op {
OpTy::Ref => {
ref_level.push(expr);
}
OpTy::Deref => {
if let Some(ref_expr) = ref_level.pop() {
self.locals.insert(ref_expr.hir_id.local_id);
}
}
OpTy::Project => {
// The generated MIR produces temporaries for the each of the nested references,
// such as `&x`, `&&x` in `(&&&x).y` for the field access of `y`.
// These temporaries must stay alive even after the field access.
ref_level.clear();
}
}
}
self.locals.insert(expr.hir_id.local_id);
}
}
impl<'a, 'b> Visitor<'a> for LocalAccessResolutionVisitor<'b> {
fn visit_expr(&mut self, expr: &'a Expr<'a>) {
match expr.kind {
hir::ExprKind::AddrOf(..)
| hir::ExprKind::Unary(hir::UnOp::Deref, _)
| hir::ExprKind::Field(..)
| hir::ExprKind::Index(..)
| hir::ExprKind::Path(..) => self.probe(expr),

// We do not probe into other function bodies or blocks,
// neither `if`s and `match`es because they will be covered in deeper visits
hir::ExprKind::If(..)
| hir::ExprKind::Match(..)
| hir::ExprKind::Block(..)
| hir::ExprKind::Closure(..)
| hir::ExprKind::ConstBlock(..) => {}
_ => intravisit::walk_expr(self, expr),
}
}
}
let mut locals = Default::default();
let mut resolution_visitor = LocalAccessResolutionVisitor { locals: &mut locals };
resolution_visitor.visit_expr(expr);
locals
}

fn resolve_expr<'tcx>(visitor: &mut RegionResolutionVisitor<'tcx>, expr: &'tcx hir::Expr<'tcx>) {
debug!("resolve_expr - pre-increment {} expr = {:?}", visitor.expr_and_pat_count, expr);

Expand Down Expand Up @@ -396,7 +496,15 @@ fn resolve_expr<'tcx>(visitor: &mut RegionResolutionVisitor<'tcx>, expr: &'tcx h
let expr_cx = visitor.cx;
visitor.enter_scope(Scope { id: then.hir_id.local_id, data: ScopeData::IfThen });
visitor.cx.var_parent = visitor.cx.parent;
visitor.visit_expr(cond);
{
visitor.visit_expr(cond);
let lifetime = Scope { id: cond.hir_id.local_id, data: ScopeData::Node };
for local in mark_local_terminating_scopes(cond) {
if local != cond.hir_id.local_id {
visitor.scope_tree.record_local_access_scope(local, lifetime);
}
}
}
visitor.visit_expr(then);
visitor.cx = expr_cx;
visitor.visit_expr(otherwise);
Expand All @@ -406,11 +514,39 @@ fn resolve_expr<'tcx>(visitor: &mut RegionResolutionVisitor<'tcx>, expr: &'tcx h
let expr_cx = visitor.cx;
visitor.enter_scope(Scope { id: then.hir_id.local_id, data: ScopeData::IfThen });
visitor.cx.var_parent = visitor.cx.parent;
visitor.visit_expr(cond);
{
visitor.visit_expr(cond);
let lifetime = Scope { id: cond.hir_id.local_id, data: ScopeData::Node };
for local in mark_local_terminating_scopes(cond) {
if local != cond.hir_id.local_id {
visitor.scope_tree.record_local_access_scope(local, lifetime);
}
}
}
visitor.visit_expr(then);
visitor.cx = expr_cx;
}

hir::ExprKind::Match(subexpr, arms, _) => {
visitor.visit_expr(subexpr);
let lifetime = Scope { id: subexpr.hir_id.local_id, data: ScopeData::Node };
for local in mark_local_terminating_scopes(subexpr) {
if local != subexpr.hir_id.local_id {
visitor.scope_tree.record_local_access_scope(local, lifetime);
}
}
walk_list!(visitor, visit_arm, arms);
}

hir::ExprKind::Index(subexpr, idxexpr) => {
visitor.visit_expr(subexpr);
visitor.visit_expr(idxexpr);
visitor.scope_tree.record_eager_scope(
idxexpr.hir_id.local_id,
Scope { id: expr.hir_id.local_id, data: ScopeData::Node },
);
}

_ => intravisit::walk_expr(visitor, expr),
}

Expand Down Expand Up @@ -683,10 +819,17 @@ fn resolve_local<'tcx>(
visitor.scope_tree.record_rvalue_scope(expr.hir_id.local_id, blk_scope);

match expr.kind {
hir::ExprKind::AddrOf(_, _, ref subexpr)
| hir::ExprKind::Unary(hir::UnOp::Deref, ref subexpr)
| hir::ExprKind::Field(ref subexpr, _)
| hir::ExprKind::Index(ref subexpr, _) => {
hir::ExprKind::AddrOf(_, _, subexpr)
| hir::ExprKind::Unary(hir::UnOp::Deref, subexpr)
| hir::ExprKind::Field(subexpr, _)
| hir::ExprKind::Index(subexpr, _) => {
if let hir::ExprKind::Path(hir::QPath::Resolved(
_,
hir::Path { res: hir::def::Res::Local(_), .. },
)) = &subexpr.kind
{
return;
}
expr = &subexpr;
}
_ => {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,6 @@
- StorageDead(_13); // scope 3 at $DIR/remove_storage_markers.rs:9:16: 9:17
_6 = const (); // scope 3 at $DIR/remove_storage_markers.rs:8:20: 10:6
- StorageDead(_12); // scope 2 at $DIR/remove_storage_markers.rs:10:5: 10:6
- StorageDead(_9); // scope 2 at $DIR/remove_storage_markers.rs:10:5: 10:6
- StorageDead(_7); // scope 2 at $DIR/remove_storage_markers.rs:10:5: 10:6
- StorageDead(_6); // scope 2 at $DIR/remove_storage_markers.rs:10:5: 10:6
_5 = const (); // scope 2 at $DIR/remove_storage_markers.rs:8:5: 10:6
Expand All @@ -80,7 +79,6 @@

bb3: {
_0 = const (); // scope 2 at $DIR/remove_storage_markers.rs:8:5: 10:6
- StorageDead(_9); // scope 2 at $DIR/remove_storage_markers.rs:10:5: 10:6
- StorageDead(_7); // scope 2 at $DIR/remove_storage_markers.rs:10:5: 10:6
- StorageDead(_6); // scope 2 at $DIR/remove_storage_markers.rs:10:5: 10:6
- StorageDead(_4); // scope 1 at $DIR/remove_storage_markers.rs:10:5: 10:6
Expand All @@ -91,6 +89,7 @@

bb4: {
- StorageDead(_14); // scope 5 at $DIR/remove_storage_markers.rs:8:14: 8:19
- StorageDead(_9); // scope 2 at $DIR/remove_storage_markers.rs:8:18: 8:19
- StorageDead(_8); // scope 2 at $DIR/remove_storage_markers.rs:8:18: 8:19
_10 = discriminant(_7); // scope 2 at $DIR/remove_storage_markers.rs:8:14: 8:19
switchInt(move _10) -> [0_isize: bb3, otherwise: bb2]; // scope 2 at $DIR/remove_storage_markers.rs:8:14: 8:19
Expand Down
9 changes: 4 additions & 5 deletions src/test/ui/async-await/issue-64130-4-async-move.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,16 +10,15 @@ impl Client {
}
}

async fn get() { }
async fn get() {}

pub fn foo() -> impl Future + Send {
//~^ ERROR future cannot be sent between threads safely
let client = Client(Box::new(true));
async move {
match client.status() {
async {
match Client(Box::new(true)).status() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just cleaning up the test, or were changes needed so that it still tested the same behavior as before? In other words, did the error from the previous version go away due to your changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error goes away with this change. However, I checked the issue #64130 and it seems that this test is intended for prompting hints for remedies when values living through yield points make generators non-Send or non-Sync. Therefore, I figured that it is good to keep this test by reproducing the previous errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

200 => {
let _x = get().await;
},
}
_ => (),
}
}
Expand Down
16 changes: 8 additions & 8 deletions src/test/ui/async-await/issue-64130-4-async-move.stderr
Original file line number Diff line number Diff line change
Expand Up @@ -6,21 +6,21 @@ LL | pub fn foo() -> impl Future + Send {
|
= help: the trait `Sync` is not implemented for `(dyn Any + Send + 'static)`
note: future is not `Send` as this value is used across an await
--> $DIR/issue-64130-4-async-move.rs:21:31
--> $DIR/issue-64130-4-async-move.rs:20:31
|
LL | match client.status() {
| ------ has type `&Client` which is not `Send`
LL | match Client(Box::new(true)).status() {
| ---------------------- has type `&Client` which is not `Send`
LL | 200 => {
LL | let _x = get().await;
| ^^^^^^ await occurs here, with `client` maybe used later
| ^^^^^^ await occurs here, with `Client(Box::new(true))` maybe used later
...
LL | }
| - `client` is later dropped here
| - `Client(Box::new(true))` is later dropped here
help: consider moving this into a `let` binding to create a shorter lived borrow
--> $DIR/issue-64130-4-async-move.rs:19:15
--> $DIR/issue-64130-4-async-move.rs:18:15
|
LL | match client.status() {
| ^^^^^^^^^^^^^^^
LL | match Client(Box::new(true)).status() {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error

Loading