Invalid regexp property escape should throw an __early__ SyntaxError. #918

Closed
leobalter opened this issue May 17, 2017 · 4 comments
Labels
needs consensus: This needs committee consensus before it can be eligible to be merged.

Comments

@leobalter
Member

As verified in Test262, I'd like to confirm that a malformed property escape in a regexp should throw an early error.

I can't write a patch for this right now, but I want to keep track of this issue.

From @anba comments:

ES 5.1 had:

An implementation must treat any instance of the following kinds of errors as an early error:

  • [...]
  • Errors in regular expression literals that are not implementation-defined syntax extensions.

Somehow this was dropped in ES6 which leads to at least one contradiction, in https://tc39.github.io/ecma262/#sec-pattern, we've got:

[...] The algorithms in 21.2.2 are designed so that compiling a pattern may throw a SyntaxError exception; on the other hand, once the pattern is successfully compiled, applying the resulting internal procedure to find a match in a String cannot throw an exception (except for any host-defined exceptions that can occur anywhere such as out-of-memory).

But a regular expression like /a|(?![\d-\d])/u may or may not evaluate the right-hand alternative of the disjunction (which contains an invalid range expression), depending on the input string.

So it should be listed as an early error in the spec, but no one has spent time on correcting this issue...
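
A minimal sketch for checking how a given engine treats the pattern quoted above. It assumes an engine that supports the `u` flag; `compileErrorName` is a hypothetical helper name, not something from the thread or the spec. The invalid range sits in the right-hand alternative, which a match against the input "a" would never need to reach, so an error reported here is reported before any input string is involved.

```javascript
// Hypothetical helper: returns the name of the error thrown while compiling
// `source` with `flags`, or null if compilation succeeds.
function compileErrorName(source, flags) {
  try {
    new RegExp(source, flags);
    return null;
  } catch (err) {
    return err.name;
  }
}

// Pattern from the comment above, written as a string for the constructor.
const rangePattern = "a|(?![\\d-\\d])";

// No input string has been supplied at this point; only compilation happens.
console.log(compileErrorName(rangePattern, "u"));
```

An engine that reports "SyntaxError" here rejects the pattern at construction time regardless of what it would later be matched against. This only exercises the `RegExp` constructor; it says nothing about when an error in a regular expression literal is reported.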

leobalter added the needs consensus label May 17, 2017
@allenwb
Member

allenwb commented Sep 13, 2017

Somehow this was dropped in ES6 which leads to at least one contradiction, in https://tc39.github.io/ecma262/#sec-pattern, we've got:

Starting with ES6 this is now covered by the first rule of 12.2.8.1. However, that rule should probably be elaborated to parse using an appropriate U value based upon whether or not u is in the FlagText part.
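
To illustrate why the flags matter when deciding whether a pattern is malformed, here is a small sketch that compiles the same source text with and without `u`. It assumes an engine that implements the Annex B regular expression grammar for non-`u` patterns (as web browsers and Node.js do); `compiles` and `NotARealProperty` are made-up names used only for illustration.

```javascript
// Hypothetical helper: true if `source` compiles with `flags`, false if the
// engine rejects it with a SyntaxError. Any other error is rethrown.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// A property escape with a made-up property name.
const propertySource = "\\p{NotARealProperty}";

// Without u, Annex B treats \p as an identity escape followed by literal text.
console.log(compiles(propertySource, ""));

// With u, \p{...} must name a valid Unicode property.
console.log(compiles(propertySource, "u"));
```

The same source text is accepted under one grammar and rejected under the other, so a parse that ignores the flags cannot decide whether the literal is an error.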

@littledan
Member

@allenwb Would you consider that the same issue or a separate issue?

@anba
Contributor

anba commented Sep 13, 2017

Somehow this was dropped in ES6 which leads to at least one contradiction, in https://tc39.github.io/ecma262/#sec-pattern, we've got:

Starting with ES6 this is now covered by the first rule of 12.2.8.1. However, that rule should probably be elaborated to parse using an appropriate U value based upon whether or not u is in the FlagText part.

12.2.8.1 only covers parsing the RegExp literal using the grammar in 21.2.1. But parsing it with 21.2.1's grammar does not include executing the runtime checks, like the one in 21.2.2.15.1 Runtime Semantics: CharacterRange, step 1.

One example of a RegExp that, per the current spec, may or may not throw a SyntaxError at runtime was given in my original comment: /a|(?![\d-\d])/u.

ES5 had this covered (more or less, probably depending on how you interpret it), in ch16:

Errors in regular expression literals that are not implementation-defined syntax extensions.

But this early error restriction was dropped in ES6.

@allenwb
Member

allenwb commented Sep 13, 2017

The intent was that the early error rule in 12.2.8.1 subsumed the ES5 Chapter 16 rule about RegExp early errors. However, I guess that wasn't made clear in 12.2.8.1.

The algorithmic syntax error detection in the RegExp evaluation algorithms really shouldn't be there. It is a carry-over from Edition 3, where the spec did not use separate static semantic clauses listing syntax-derived early errors (hence the need for the Ch 16 rule).

What I should have done in ES6 is factor all of those syntax error checks out of the RegExp evaluation algorithms and restate them as early error rules in 21.2.1.1. Not doing so was an oversight on my part, or perhaps I ran out of time.

The purpose of placing these errors into 21.2.1.1 (and of early error static semantics in general) is to clearly state the errors that identify malformed ES code and that can be detected and reported independently of any run-time context or values. The goal is to eliminate implementation variations in when such errors are reported.
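
The difference between an early error and one raised during evaluation can be observed from script code. The following is a rough probe, not a conformance test: `classifyLiteralError` is a hypothetical helper, and it assumes an environment where the `Function` constructor is available (it is blocked under some content security policies). It first parses the literal inside a branch that never executes, then evaluates it for real.

```javascript
// Hypothetical probe: reports when an engine raises a SyntaxError for a
// regular expression literal given as source text.
//   "early"   - rejected while parsing, although the literal is unreachable
//   "runtime" - accepted by the parser, rejected once the literal is evaluated
//   "none"    - no SyntaxError at all
function classifyLiteralError(literalText) {
  try {
    // Parsing only: the literal sits in a branch that never runs.
    new Function("if (false) { " + literalText + "; }");
  } catch (err) {
    if (err instanceof SyntaxError) return "early";
    throw err;
  }
  try {
    // Evaluation: the literal is actually reached.
    new Function("return " + literalText + ";")();
  } catch (err) {
    if (err instanceof SyntaxError) return "runtime";
    throw err;
  }
  return "none";
}

// A literal with a made-up property name, and a well-formed one for contrast.
console.log(classifyLiteralError("/\\p{NotARealProperty}/u"));
console.log(classifyLiteralError("/abc/u"));
```

An engine that treats malformed patterns as early errors would report "early" for the first literal; one that defers the check until the literal is evaluated would report "runtime". That variation is what the early error rules are meant to remove.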


4 participants