Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further improve diagnostics for expressions in pattern position #121697

Open
8 of 11 tasks
fmease opened this issue Feb 27, 2024 · 3 comments
Open
8 of 11 tasks

Further improve diagnostics for expressions in pattern position #121697

fmease opened this issue Feb 27, 2024 · 3 comments
Assignees
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-parser Area: The parsing of Rust source code to an AST A-patterns Relating to patterns and pattern matching D-newcomer-roadblock Diagnostics: Confusing error or lint; hard to understand for new users. D-terse Diagnostics: An error or lint that doesn't give enough information about the problem at hand. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@fmease
Copy link
Member

fmease commented Feb 27, 2024

I've had this on my TODO list for a while but never had the time to write a PR for it. Therefore I'm just going to open an issue for it instead.

Since PR #118625 we try to reparse something in pattern position as an expression under certain conditions and provide a smarter diagnostic of the form “expected a pattern, found an expression // arbitrary expressions are not allowed in patterns” which is already amazing.

However, the current heuristic has a few limitations I'd like to see addressed which should be relatively simple to implement:

  1. The heuristic doesn't cover field accesses (of the form $literal.$ident and so forth) but only things that look like method calls (e.g., 0.f()). I argue we should also detect expressions like x.f and s.0. This would've prevented #104996 from getting opened! (addressed in #123877)
    • Firstly we need to remove the extra check for parentheses located after the .-“herald” check.
    • Then, we should in my opinion check the Span to make sure that there's no whitespace/comment before & after the . to avoid overzealously emitting this diagnostic when the user accidentally typoed a comma as a dot (as in (x. y)(x, y)) which isn't too farfetched given the keys are right next to each other on most keyboards (just need to check sp0.hi == sp1.lo). Of course, this would rule out multi-line method / field chains but that should be rare. We could (as a follow-up) check if the would-be expression is “appropriately” indented (by utilizing SourceMap methods) to address that, too. (NOTE: this isn't implemented yet, unclear if we actually want this)
    • The following is a bit opinionated: I think we should remove the capitalization check (at the moment, we don't consider x.Foo() to be a method call). If you disagree, we should at least also consider _ to be a valid start of a field or method name (x._foo() doesn't get recognized at the time of writing).
    • We should support sequences of numeric field accesses like x.0.1 without accidentally flagging x.0e4 as an expression (x.0.1 gets lexed as [Ident, Dot, FloatLit]). See parse_expr_tuple_field_access_float for prior art.
  2. We should consider the keyword as to be a “herald” next to . to recover 0 as usize etc. in pattern position (addressed in #123877).
  3. We should somehow recover f().g() and (0).f() etc. meaning more method calls / field accesses where the “basis” is a more complex expression. Rephrased: Recognize more “$expr.$ident” over just “$lit_or_ident.$ident” (the latter is the status quo, roughly, if I'm not mistaken).
    • Recovering from (0).f() was addressed in #123877.
  4. We should add a help message (structured or text-based) instructing the user to extract the expression to a const item: “extract the expression to a const and refer to that” (addressed in #123877).
  5. At some point we should provide a structured multipart suggestion for wrapping expressions in pattern position in a const {} block (feature inline_const_pat). We could provide the suggestion if self.sess.unstable_features.is_nightly_build() holds but I don't know if others like the idea of that. Note that we cannot check if feature inline_const_pat is enabled from inside the parser and diagnostic stashing/stealing doesn't help here either since we want to “stash” a subdiagnostic (a structured suggestion) not true actually, should be possible to impl (addressed in #123877.
  6. Various FIXMEs added in Further improve diagnostics for expressions in pattern position #123877
@fmease fmease added A-diagnostics Area: Messages for errors, warnings, and lints A-parser Area: The parsing of Rust source code to an AST T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-patterns Relating to patterns and pattern matching labels Feb 27, 2024
@fmease
Copy link
Member Author

fmease commented Feb 27, 2024

cc @ShE3py, @WaffleLapkin

@fmease fmease changed the title Further improve diagnostics for expressions in patterns Further improve diagnostics for expressions in pattern position Feb 27, 2024
@fmease fmease added D-newcomer-roadblock Diagnostics: Confusing error or lint; hard to understand for new users. D-terse Diagnostics: An error or lint that doesn't give enough information about the problem at hand. labels Feb 27, 2024
@fmease
Copy link
Member Author

fmease commented Feb 27, 2024

I know that this is quite a wall of text and I don't expect anyone to read it. I've tagged you, ShE3py, since you might be interested in working on some of the things I listed here. Otherwise this issue just serves as a public TODO list for me.

@ShE3py
Copy link
Contributor

ShE3py commented Apr 7, 2024

Okay, I've somehow resurrected.

1. ii. Then, we should in my opinion check the Span to make sure that there's no whitespace/comment before & after the . to avoid overzealously emitting this diagnostic when the user accidentally typoed a comma as a dot (as in (x. y)(x, y)) which isn't too farfetched given the keys are right next to each other on most keyboards

It might be worth to emit both diagnostics (e.g. “unexpected field access // try introducing a const VAL: _ = x. y // you may have meant to use a comma there”).

1. iii. The following is a bit opinionated: I think we should remove the capitalization check (at the moment, we don't consider x.Foo() to be a method call).

Likewise. Especially since some scripts e.g. Arabic doesn't have cases (the check should've been something like !c.is_uppercase()).

1. iv. See parse_expr_tuple_field_access_float for prior art.

Note to self: this was inlined into parse_dot_suffix_expr in #122858.
Permalink to that PR's parent:

fn parse_expr_tuple_field_access_float(
&mut self,
lo: Span,
base: P<Expr>,
float: Symbol,
suffix: Option<Symbol>,
) -> P<Expr> {
match self.break_up_float(float, self.token.span) {
// 1e2
DestructuredFloat::Single(sym, _sp) => {
self.parse_expr_tuple_field_access(lo, base, sym, suffix, None)
}
// 1.
DestructuredFloat::TrailingDot(sym, ident_span, dot_span) => {
assert!(suffix.is_none());
self.token = Token::new(token::Ident(sym, IdentIsRaw::No), ident_span);
let next_token = (Token::new(token::Dot, dot_span), self.token_spacing);
self.parse_expr_tuple_field_access(lo, base, sym, None, Some(next_token))
}
// 1.2 | 1.2e3
DestructuredFloat::MiddleDot(symbol1, ident1_span, dot_span, symbol2, ident2_span) => {
self.token = Token::new(token::Ident(symbol1, IdentIsRaw::No), ident1_span);
// This needs to be `Spacing::Alone` to prevent regressions.
// See issue #76399 and PR #76285 for more details
let next_token1 = (Token::new(token::Dot, dot_span), Spacing::Alone);
let base1 =
self.parse_expr_tuple_field_access(lo, base, symbol1, None, Some(next_token1));
let next_token2 = Token::new(token::Ident(symbol2, IdentIsRaw::No), ident2_span);
self.bump_with((next_token2, self.token_spacing)); // `.`
self.parse_expr_tuple_field_access(lo, base1, symbol2, suffix, None)
}
DestructuredFloat::Error => base,
}
}

I know that this is quite a wall of text and I don't expect anyone to read it. I've tagged you, ShE3py, since you might be interested in working on some of the things I listed here. Otherwise this issue just serves as a public TODO list for me.

I don't mind reading, and I often find myself writing walls of text, albeit in French, and my English isn't so good as to write pavés in it while retaining the same meaning.

@rustbot claim

bors added a commit to rust-lang-ci/rust that referenced this issue Sep 19, 2024
Further improve diagnostics for expressions in pattern position

Follow-up of rust-lang#118625, see rust-lang#121697.

```rs
fn main() {
    match 'b' {
        y.0.0.1.z().f()? as u32 => {},
    }
}
```
Before:
```
error: expected one of `=>`, ``@`,` `if`, or `|`, found `.`
 --> src/main.rs:3:10
  |
3 |         y.0.0.1.z().f()? as u32 => {},
  |          ^ expected one of `=>`, ``@`,` `if`, or `|`
```
After:
```
error: expected a pattern, found an expression
 --> src/main.rs:3:9
  |
3 |         y.0.0.1.z().f()? as u32 => {},
  |         ^^^^^^^^^^^^^^^^^^^^^^^ arbitrary expressions are not allowed in patterns
  |
help: consider moving the expression to a match arm guard
  |
3 |         val if val == y.0.0.1.z().f()? as u32 => {},
  |         ~~~ +++++++++++++++++++++++++++++++++
help: consider extracting the expression into a `const`
  |
2 +     const VAL: /* Type */ = y.0.0.1.z().f()? as u32;
3 ~     match 'b' {
4 ~         VAL => {},
  |
help: consider wrapping the expression in an inline `const` (requires `#![feature(inline_const_pat)]`)
  |
3 |         const { y.0.0.1.z().f()? as u32 } => {},
  |         +++++++                         +
```

---

r? fmease
`@rustbot` label +A-diagnostics +A-parser +A-patterns +C-enhancement
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-parser Area: The parsing of Rust source code to an AST A-patterns Relating to patterns and pattern matching D-newcomer-roadblock Diagnostics: Confusing error or lint; hard to understand for new users. D-terse Diagnostics: An error or lint that doesn't give enough information about the problem at hand. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

2 participants