Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reflect parentheses that discard extra values in the AST #79

Open
fstirlitz opened this issue Dec 12, 2019 · 5 comments · May be fixed by #98
Open

Reflect parentheses that discard extra values in the AST #79

fstirlitz opened this issue Dec 12, 2019 · 5 comments · May be fixed by #98
Labels
compat hazard Resolving this issue may create backwards compatibility problems incorrect parsing Valid Lua code fails to parse, is parsed incorrectly, or invalid Lua code is accepted
Milestone

Comments

@fstirlitz
Copy link
Owner

Currently, the latest released version does not distinguish e.g. f() from (f()) in the AST. These expressions are not equivalent; the former evaluates to all the return values of f(), and the latter evaluates to only the first of them (the rest are discarded). Current git master distinguishes them by the inParens field, but that is just a workaround for #24 and I expect to remove it when that issue is fixed properly.

A more principled approach to distinguishing these expressions is desired. I may keep the inParens field, or I may introduce a new kind of node that expresses the operation of discarding extra values.

@fstirlitz fstirlitz added the compat hazard Resolving this issue may create backwards compatibility problems label Dec 12, 2019
@fstirlitz
Copy link
Owner Author

fstirlitz commented Jan 12, 2020

There are two kinds of expressions which may evaluate to a value tuple: function calls and the vararg literal, .... There are also five places where discarding extra values matters, all of which are in a context where a multiple-value expression appears as the last one in a comma-delimited list of expressions:

  • table constructors ({ a, b, (...) }; table constructors can also use ; as delimiters, and accept trailing delimiters)
  • simultaneous assignment (a, b, c = d, e, (...))
  • argument lists (f(a, b, (...))
  • for-in loops (for a, b, c in d, e, (...) do --[[ ... ]] end)
  • return statement (return a, b, (...))

In the last of these, parenthesising the only expression also has the effect of preventing tail-call optimisation. This suggests a representation as a new type of node, so that tail calls can be detected simply by checking whether a function-call node appears as a direct child of the return node. This will be a breaking change for users who are not prepared to deal with unknown node types; luamin by @mathiasbynens would be one. In this case it's arguably a good thing anyway, because as of now luamin will just perform minification incorrectly on code affected by this defect.

Alternatively, I might treat parentheses as a circumfix unary operator, and piggyback on the UnaryExpression node type to reflect it in the AST.

Lua manual links: 5.1 §2.5, 5.2 §3.4, 5.3 §3.4

@fstirlitz fstirlitz added the incorrect parsing Valid Lua code fails to parse, is parsed incorrectly, or invalid Lua code is accepted label Jan 15, 2020
@fstirlitz fstirlitz added this to the 0.4 milestone Jun 4, 2021
@arnoson
Copy link

arnoson commented Nov 29, 2021

Im currently updating a fork of lua-fmt that uses on old version of this library.
The needsParens method currenlty relies on node.inParens. Could you point me in the right direction of how to rewrite this methods using the new version of luaparse?

needsParens() {
  const parent = this.getParent() as luaparse.Node
  const value = this.getValue() as luaparse.Node

  let inParens = false
  switch (value.type) {
    case 'FunctionDeclaration':
    case 'Chunk':
    case 'Identifier':
    case 'BooleanLiteral':
    case 'NilLiteral':
    case 'NumericLiteral':
    case 'StringLiteral':
    case 'VarargLiteral':
    case 'TableConstructorExpression':
    case 'BinaryExpression':
    case 'LogicalExpression':
    case 'UnaryExpression':
    case 'MemberExpression':
    case 'IndexExpression':
    case 'CallExpression':
    case 'TableCallExpression':
    case 'StringCallExpression':
      inParens = value.inParens || false
  }

  if (parent) {
    /*
          If this UnaryExpression is nested below another UnaryExpression, wrap the nested expression in
          parens. This not only improves readability of complex expressions, but also prevents `- -1` from
          becoming `--1`, which would result in a comment.
      */
    if (value.type === 'UnaryExpression' && parent.type === 'UnaryExpression') {
      inParens = true
    }
  }

  return inParens
}

@arnoson
Copy link

arnoson commented Nov 29, 2021

I found a solution (mixing prettier's lua and php plugin approaches). The only thing that is missing for now is distinguishing fun() from (fun()) so I can keep the parens.
Thanks for the great work!

@dlbd
Copy link

dlbd commented Sep 23, 2022

Any update on this? I was working on a custom Lua minifier with a somewhat better compression ratio than luamin, and am currently stuck with a luaparse version from 2019 (before .inParens was removed).

@cohler
Copy link

cohler commented Dec 20, 2022

This is a MAJOR BUG giving a totally useless and INCORRECT parse. Why hasn't anyone fixed this yet?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compat hazard Resolving this issue may create backwards compatibility problems incorrect parsing Valid Lua code fails to parse, is parsed incorrectly, or invalid Lua code is accepted
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants