Heisen-variables in lists #6141

Closed
michalmuskala opened this issue Jul 7, 2022 · 14 comments
Labels
bug Issue is reported as a bug not a bug Issue is determined as not a bug by OTP team:VM Assigned to OTP team VM

Comments

@michalmuskala
Contributor

michalmuskala commented Jul 7, 2022

Describe the bug
The compiler is inconsistent in its view of whether expressions inside lists are evaluated in parallel or in sequence.

To Reproduce
Given the following code:

-module(test).

-compile(export_all).

% test1() ->
%   [X = 1, X].

test2() ->
  [X = 1, X = 2].

Function test1 fails to compile with the message: variable 'X' is unbound.

However, function test2 compiles, but has semantics similar to [1, 1 = 2], or to Elixir's [x = 1, ^x = 2]. In effect, the compiler considers X already bound by the first element of the list when it evaluates the second element.

Expected behavior
Variables defined in earlier elements should either always be undefined or always defined in both expression and pattern positions.

Affected versions
Tested on Erlang/OTP 25 [erts-13.0.1]
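
A minimal sketch for observing the runtime behaviour of test2 described above. The module name heisen_demo and the function run/0 are made up for illustration, and the comment about the outcome restates the report rather than a documented guarantee.

```erlang
-module(heisen_demo).
-export([run/0]).

%% Evaluates the list from test2 and reports how the evaluation ended.
%% According to the report, the second match is checked against the value
%% bound by the first, so a badmatch error is expected on OTP 25.
run() ->
    try [X = 1, X = 2] of
        List -> {ok, List}
    catch
        error:{badmatch, Value} -> {badmatch, Value}
    end.
```

Which value ends up in the badmatch tuple depends on the order in which the two matches are evaluated, so it is best not to rely on it.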

@michalmuskala michalmuskala added the bug Issue is reported as a bug label Jul 7, 2022
@RaimoNiskanen
Contributor

Is the evaluation order when constructing a compound term really defined?
I think it is not, and then the compilation result appears valid.
The compiler could produce a warning for test2(), though, since one of the matches cannot match.

@michalmuskala
Contributor Author

This is actually even stranger in a map. The variable is used and unused at the same time:

$ cat test3.erl
-module(test3).

-compile(export_all).

% test1() ->
%   [X = 1, X].

test2() ->
    #{key1 => X = 1, key2 => X = 2}.


$ erlc test3.erl
test3.erl:3:2: Warning: export_all flag enabled - all functions will be exported
%    3| -compile(export_all).
%     |  ^

test3.erl:9:15: Warning: variable 'X' is unused
%    9|     #{key1 => X = 1, key2 => X = 2}.
%     |               ^

test3.erl:9:30: Warning: no clause will ever match
%    9|     #{key1 => X = 1, key2 => X = 2}.
%     |                              ^

test3.erl:9:30: Warning: this clause cannot match because its guard evaluates to 'false'
%    9|     #{key1 => X = 1, key2 => X = 2}.
%     |                              ^

test3.erl:9:30: Warning: variable 'X' is unused
%    9|     #{key1 => X = 1, key2 => X = 2}.
%     |                              ^

@michalmuskala
Contributor Author

michalmuskala commented Jul 7, 2022

> Is the evaluation order when constructing a compound term really defined?
> I think it is not, and then the compilation result appears valid.

The order should be consistent, though, between value position and pattern position. Right now it seems that for value position (test1) the expressions are evaluated in parallel (no leaking of variables between list elements), but for pattern position (test2) the expressions are evaluated in sequence (leaking of variables between list elements).

I don't really know which behaviour should be considered "correct", but it should be consistent.

@garazdawi garazdawi added the team:VM Assigned to OTP team VM label Jul 7, 2022
@rickard-green
Contributor

> > Is the evaluation order when constructing a compound term really defined?
> > I think it is not, and then the compilation result appears valid.
>
> The order should be consistent, though, between value position and pattern position. Right now it seems that for value position (test1) the expressions are evaluated in parallel (no leaking of variables between list elements), but for pattern position (test2) the expressions are evaluated in sequence (leaking of variables between list elements).
>
> I don't really know which behaviour should be considered "correct", but it should be consistent.

Why? Since the evaluation order is undefined, you should not depend on the order anyway. The point of leaving the evaluation order undefined is so that it can be changed. A program that depends on the order is broken even if it happens to work at some point.

I agree that the warning mess in the map case is quite confusing though...

@okeuday
Contributor

okeuday commented Jul 7, 2022

The evaluation of arithmetic, logical, relational and bitwise operators should still be defined in Erlang. If there is a problem related to that ordering, it would be a bug.

@RaimoNiskanen
Contributor

This is similar to:

foo() ->
    #{ key1 => a(1), key2 => b(2) }.

where it is undefined which of the functions a/1 or b/1 is called first.
An obvious solution is to bind the values in the desired order first:

foo() ->
    Val1 = b(2),
    Val2 = a(1),
    #{ key1 => Val2, key2 => Val1 }.

Is variable binding some kind of compile time side-effect?

Nevertheless, I think there should be no defined order between variable bindings within term construction, and no requirement of the order being consistent (according to any definition of consistent).

@michalmuskala
Contributor Author

michalmuskala commented Jul 8, 2022

> The evaluation of arithmetic, logical, relational and bitwise operators should still be defined in Erlang. If there is a problem related to that ordering, it would be a bug.

Except for the andalso and orelse operators, the order is not defined for the other arithmetic, logical, relational, and bitwise operators either, according to the documentation.

@okeuday
Contributor

okeuday commented Jul 8, 2022

@michalmuskala https://www.erlang.org/doc/reference_manual/expressions.html#operator-precedence

@rickard-green
Contributor

@okeuday that does not say anything about the order of evaluation of sub-expressions

For example:

a() + b()*c()

Any evaluation order of a(), b() and c() is valid as long as the result of b() is multiplied with the result of c() before being added to the result of a().
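
A small sketch of how the distinction above could be observed. The module order_demo and its helpers (note/2, get_trace/0, and the return values 2, 3 and 4) are hypothetical and only serve the illustration. Each function records its own call in the process dictionary before returning a number, so the trace shows the order the implementation happened to pick, while the arithmetic result is fixed by operator precedence alone.

```erlang
-module(order_demo).
-export([run/0]).

%% Records that Name was called, then returns Value.
note(Name, Value) ->
    put(trace, [Name | get_trace()]),
    Value.

%% Returns the calls recorded so far, most recent first.
get_trace() ->
    case get(trace) of
        undefined -> [];
        Trace -> Trace
    end.

a() -> note(a, 2).
b() -> note(b, 3).
c() -> note(c, 4).

%% Returns {Result, CallOrder}. Result is 2 + 3 * 4 for any call order;
%% CallOrder is whatever order the implementation chose.
run() ->
    erase(trace),
    Result = a() + b() * c(),
    {Result, lists:reverse(get_trace())}.
```

The first element of the returned tuple is always 14. The second element is the part a program should not depend on.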

@okeuday
Contributor

okeuday commented Jul 9, 2022

@rickard-green The issue was created for variable use without function calls, so that appears more relevant. The remark "the evaluation order is undefined" made me think "Erlang is Chaos!" because we do have operator precedence guarantees that should remain valid. With your example we should always have the multiplication (of "b()*c()") before the addition (to "a()"). I mentioned operator precedence because it may be a separate area to look for a similar problem.

@rickard-green
Contributor

@okeuday

> @rickard-green The issue was created for variable use without function calls, so that appears more relevant.

No, function calls are expressions just as much as a match expression. The point is that it is an expression. In my opinion it makes the example clearer.

> The remark "the evaluation order is undefined" made me think "Erlang is Chaos!" because we do have operator precedence guarantees that should remain valid.

Operator precedence and evaluation order of sub-expressions are two different things.

> With your example we should always have the multiplication (of "b()*c()") before the addition (to "a()").

If by that you mean that a() must be evaluated after b() and c(), you are wrong. See my comment with the example...

> I mentioned operator precedence because it may be a separate area to look for a similar problem.

@rickard-green rickard-green added the not a bug Issue is determined as not a bug by OTP label Jul 12, 2022
@rickard-green
Contributor

Closing this since it isn't a bug.

@josevalim
Contributor

josevalim commented Jul 13, 2022

@rickard-green I think that even if the evaluation order is not defined, the semantics should be well defined: i.e. we should define the semantics of what happens when there are multiple variables in a list, regardless of evaluation order. For example, in this case:

test2() ->
  [X = 1, X = 2].

We could define the semantics to be: variables defined in lists are not observed until after the list is created. If the same variable X is defined in multiple places, they have to be equal, otherwise you get a match error.

The sentence above should accurately describe the current behaviour while asserting nothing about the order. Those are also the semantics that erl_eval implements. But it could also have been: "X will be 1 or 2 and you don't know which". We could also say both evaluation order and semantics are undefined, but that's a bit too imprecise, I think.

In other words, I agree there is no bug for lists because it does have clear semantics (at least to me), but I would say there is definitely a bug when it comes to maps. :)
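
One way to try the erl_eval behaviour mentioned above, as a rough sketch. The module name eval_demo and the function eval/1 are invented for this example; erl_scan:string/1, erl_parse:parse_exprs/1, erl_eval:exprs/2 and erl_eval:new_bindings/0 are the standard library calls it relies on.

```erlang
-module(eval_demo).
-export([eval/1]).

%% Parses a string holding Erlang expressions (terminated by a period)
%% and evaluates it with erl_eval, starting from empty bindings.
%% On success the result has the form {value, Value, NewBindings}.
eval(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    {ok, Exprs} = erl_parse:parse_exprs(Tokens),
    erl_eval:exprs(Exprs, erl_eval:new_bindings()).
```

Calling eval_demo:eval("[X = 1, X = 1].") should yield a {value, [1, 1], Bindings} tuple. If erl_eval follows the semantics described in this comment, eval_demo:eval("[X = 1, X = 2].") is expected to fail with a badmatch error instead.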

@RaimoNiskanen
Contributor

I have no idea how well it would be possible to define these semantics, since it is the semantics for arbitrary expressions with intermingled pattern matches that needs to be defined. A list is just a special case.

foo(A, B, C, D) ->
    {A + (Y = B + C),
     (Z = element(2, D))
         #{(X = B and C) => [Y | X = A + B],
           [X,Y]         => (Z = bar(Y = element(1, D) + X))}}.

You get the point...

Nevertheless, maybe open a broader issue about whether and how to clarify the semantics, which the language details team can ponder after the vacations.
