Fix bytecode generation for var statements in for-in statements #52

pmarkee · 2019-01-10T09:33:02Z

Fixes #29

Signed-off-by: Peter Marki [email protected]

robertsipka · 2019-01-10T13:44:15Z

src/parser/ast/ArrowFunctionExpressionNode.h

+
+ // Do not generate bytecode for this expression if it has already been generated.
+ if (!blk)
+ return;


Could not we avoid to step into these generateExpressionByteCode functions (in case of FunctionExpression or ArrowFunctionExpression nodes) at a second time? I think these functions will be called twice in the generateStatementByteCode function of ForInSatementNode class. Can`t we just skip it at the second time? For example, if the left value of the forInStatement is a variable declaration or something else.
That's why I have another question: What about if I create an object without function expressions:

for ( var id_0 = { a: 2, b: 10 } in Array.toString ) { }

Will the engine also generate bytecodes for a and b twice or this case is handled? My assumption is that the duplication appears here as well, even though no assert will occur.

Note: these are just questions, however, this patch may be the right solution.

[FYI] by setting ENV variable DUMP_BYTECODE to any value (e.g. export DUMP_BYTECODE=1), escargot print the generated bytecode.

Source Code

for ( var id_0 = { toString : function ( ) {} } in Array.toString ) { }

Bytecode

register info.. (stack variable size(1)) [X,X,X,X,X,X,`this`,] 0 createobject -> r3 | CreateObject (line: 1:20) 48 create function -> r4 | CreateFunction (line: 1:40) 104 object define own property with name r3.toString <- r4 | ObjectDefineOwnPropertyWithNameOperation (line: 1:20) 160 set global object global.id_0 <- r3 | SetGlobalObject (line: 1:11) 232 get global object r2 <- global.Array | GetGlobalObject (line: 1:52) 304 get object r1 <- r2.toString | GetObjectPreComputedCase (line: 1:52) 432 load r2 <- undefined | LoadLiteral (line: 1:5) 488 equal r2 <- r2 , r1 | BinaryEqual (line: 1:5) 544 jump if true r2 -> -983881656 | JumpIfTrue (line: 1:5) 600 load r2 <- null | LoadLiteral (line: 1:5) 656 equal r2 <- r2 , r1 | BinaryEqual (line: 1:5) 712 jump if true r2 -> -983881656 | JumpIfTrue (line: 1:5) 768 enumerate object r1, r5 | EnumerateObject (line: 1:5) 816 check if key is last r5 | CheckIfKeyIsLast (line: 1:5) 872 enumerate object key r1 r5 | EnumerateObjectKey (line: 1:5) 920 createobject -> r2 | CreateObject (line: 1:20) 968 object define own property with name r2.toString <- r3 | ObjectDefineOwnPropertyWithNameOperation (line: 1:20) xxx create function -> r3 | CreateFunction (line: 1:40) // duplicated generation of FunctionExpression 1024 set global object global.id_0 <- r2 | SetGlobalObject (line: 1:11) 1096 jump -983882048 | Jump (line: 1:5) 1152 jump -983881656 | Jump (line: 1:5) 1208 end | End (line: 1:1)

The above shows the generated bytecode of the source code (by default, it occurs an assert error before printing the bytecode, so I reconstruct it)
In this example, generation of FunctionExpression bytecode is duplicated twice(as you commented), one for initialization of the global variable id_0 and the other for for-in statement. which is the main problem of the issue.

Bytecode (FunctionExpression ast node) for generation of each FunctionExpression should be exist only at once.
Based on this assumption, function generateExpressionByteCode of FunctionExpressionNode sequentially allocates the pre-generated CodeBlock for each FunctionExpressionNode.
But, in the above case, when second identical FunctionExpression node about to generate a bytecode, it can not find the CodeBlock because first FunctionExpression node already take the CodeBlock and increase the counter (one CodeBlock is pre-generated for each FunctionExpression).

for (size_t i = 0; i < len; i++) { CodeBlock* c = childBlocks[i]; if (c->isFunctionExpression()) { if (cnt == counter) { blk = c; break; } cnt++; } }

From the above code in generateExpressionByteCode function, it cannot match the related CodeBlock due to the already increased counter and finally assert error occurred.

This duplicated FunctionExpression node issue is only related with for-in statement (maybe same for for-of statement, but for-of is not yet implemented).
I think parsing for for-of statement should be fixed to not to generate duplicated variable declaration nodes.
However, this patch which just returns for the case of no matched CodeBlock is not a complete solution because it can not check the rule that one FunctionExpression node for one FunctionExpression generation. And this patch also removes the second FunctionExpression bytecode, assigning an empty object to variable id_0 during the for-in iteration.

Thanks for the in-depth explanation @clover2123 ! I tried to simply just skip the second generation but it makes some tests fail and causes segfaults, so I continue working on the issue and try to come up with a good solution.

pmarkee · 2019-01-15T09:35:52Z

@robertsipka @clover2123 I have updated the PR. I decided to simply get rid of the first generation of the bytecode, as the second generation will generate it and put it in the global scope anyway. I'm not sure if this is the right approach but I tried it with some test suites and it all passed.

clover2123 · 2019-01-17T07:23:19Z

@pmarkee In my opinion, the first variable declaration should be generated and executed before for-in iteration as it is. And the second variable declaration need to be fixed such that the result of for-in operation is allocated in the variable without executing the right expression.

var object = {a: 1, b: 2, c: 3}; 
function func_counter() {
    print("func_counter() called")
}
for (var property = func_counter() in object) {
    print("property : "  + property);
}

result

func_counter() called
property : a
property : b
property : c

From the above example which is obtained from JSC engine, we can find that the variable declaration is executed before the for-in iteration and func_counter() is called only once.

pmarkee · 2019-02-01T09:47:24Z

@clover2123 @robertsipka I have updated the PR, I found another possible approach. I simply set the init of the variable declarator node to nullptr, so the initial value of the variable will be completely ignored and the for-in statement will work as expected.

robertsipka · 2019-02-06T08:25:25Z

src/parser/esprima_cpp/esprima.cpp

@@ -4758,6 +4758,7 @@ class Parser : public gc {

 if (declarations.size() == 1 && this->matchKeyword(InKeyword)) {
 RefPtr<VariableDeclaratorNode> decl = declarations[0];
+ decl->setInitPtr(nullptr);


I'm not sure it would be the best to handle it here. I think it works because you avoid to call the generateExpressionByteCode call for this init node in here by set its value to nullptr. Maybe you could skip the bytecode generation there instead of overwrite its value here. I did not investigate it fully, so I would also like to hear the opinions of others and you.

clover2123 · 2019-02-11T03:22:49Z

src/parser/esprima_cpp/esprima.cpp

@@ -4758,6 +4758,7 @@ class Parser : public gc {

 if (declarations.size() == 1 && this->matchKeyword(InKeyword)) {
 RefPtr<VariableDeclaratorNode> decl = declarations[0];
+ decl->setInitPtr(nullptr);
 // if (decl->init() && (decl.id.type === Syntax.ArrayPattern || decl.id.type === Syntax.ObjectPattern || this->context->strict)) {
 if (decl->init() && (decl->id()->type() == ArrayExpression || decl->id()->type() == ObjectExpression || this->context->strict)) {


After setting the m_init to null value, this if statement that checks some syntax errors will be always ignored. Instead of your approach, how about just not generating the bytecode of m_init in ForInStatementNode::generateStatementByteCode?

It can't really be skipped in ForInStatementNode::generateStatementByteCode, because the first time it generates the initial value for m_left, then the second time it will assign the result of the object enumeration to it (the value that will be used in the current iteration). Both are needed to work correctly, so we can't really skip it here. I'm leaning more towards the suggestion by @robertsipka . I'm currently investigating whether the if check here is necessary at all, we may be able to skip that altogether.

@pmarkee
As you commented, generating a store bytecode of m_left at the second time would be the proper target point. It may be possible not to generate a store bytecode of m_init in VariableDeclarationNode::generateStoreByteCode (instead calling a new function generateStoreByteCodeWithoutInit?).

pmarkee · 2019-02-15T09:48:38Z

@clover2123 @robertsipka I updated the pull request, I removed the if statement in VariableDeclarationNode::generateStoreByteCode() as suggested, we don't need that IMO.

robertsipka

Could you please also add the following test case from the issue #29?

for ( var id_0 = { toString : function ( ) {} } in Array.toString ) { }

LGTM (informally).

Fixes #29 Signed-off-by: Peter Marki [email protected]

pmarkee · 2019-02-18T13:23:06Z

Thanks @robertsipka , I have added the test case to the regression test.

clover2123

LGTM

yichoi

LGTM

pmarkee changed the title ~~Do not generate byte code for functions if it was already generated~~ [WIP] Do not generate byte code for functions if it was already generated Jan 10, 2019

robertsipka reviewed Jan 10, 2019

View reviewed changes

pmarkee changed the title ~~[WIP] Do not generate byte code for functions if it was already generated~~ [WIP] Do not generate byte code twice for global variables in for-in statements Jan 15, 2019

pmarkee changed the title ~~[WIP] Do not generate byte code twice for global variables in for-in statements~~ Fix bytecode generation for var statements in for-in statements Feb 1, 2019

robertsipka reviewed Feb 6, 2019

View reviewed changes

clover2123 reviewed Feb 11, 2019

View reviewed changes

robertsipka reviewed Feb 18, 2019

View reviewed changes

Fix bytecode generation for var statements in for-in statements

5b3d0e4

Fixes #29 Signed-off-by: Peter Marki [email protected]

clover2123 approved these changes Feb 19, 2019

View reviewed changes

yichoi approved these changes Feb 19, 2019

View reviewed changes

yichoi merged commit 250d95d into Samsung:master Feb 19, 2019

pmarkee deleted the issue-29 branch February 20, 2019 11:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix bytecode generation for var statements in for-in statements #52

Fix bytecode generation for var statements in for-in statements #52

pmarkee commented Jan 10, 2019

robertsipka Jan 10, 2019 •

edited

Loading

clover2123 Jan 11, 2019

clover2123 Jan 11, 2019

clover2123 Jan 11, 2019 •

edited

Loading

pmarkee Jan 14, 2019 •

edited

Loading

pmarkee commented Jan 15, 2019

clover2123 commented Jan 17, 2019 •

edited

Loading

pmarkee commented Feb 1, 2019

robertsipka Feb 6, 2019

clover2123 Feb 11, 2019

pmarkee Feb 12, 2019

clover2123 Feb 13, 2019

pmarkee commented Feb 15, 2019

robertsipka left a comment

pmarkee commented Feb 18, 2019

clover2123 left a comment

yichoi left a comment

Fix bytecode generation for var statements in for-in statements #52

Fix bytecode generation for var statements in for-in statements #52

Conversation

pmarkee commented Jan 10, 2019

robertsipka Jan 10, 2019 • edited Loading

Choose a reason for hiding this comment

clover2123 Jan 11, 2019

Choose a reason for hiding this comment

clover2123 Jan 11, 2019

Choose a reason for hiding this comment

clover2123 Jan 11, 2019 • edited Loading

Choose a reason for hiding this comment

pmarkee Jan 14, 2019 • edited Loading

Choose a reason for hiding this comment

pmarkee commented Jan 15, 2019

clover2123 commented Jan 17, 2019 • edited Loading

pmarkee commented Feb 1, 2019

robertsipka Feb 6, 2019

Choose a reason for hiding this comment

clover2123 Feb 11, 2019

Choose a reason for hiding this comment

pmarkee Feb 12, 2019

Choose a reason for hiding this comment

clover2123 Feb 13, 2019

Choose a reason for hiding this comment

pmarkee commented Feb 15, 2019

robertsipka left a comment

Choose a reason for hiding this comment

pmarkee commented Feb 18, 2019

clover2123 left a comment

Choose a reason for hiding this comment

yichoi left a comment

Choose a reason for hiding this comment

robertsipka Jan 10, 2019 •

edited

Loading

clover2123 Jan 11, 2019 •

edited

Loading

pmarkee Jan 14, 2019 •

edited

Loading

clover2123 commented Jan 17, 2019 •

edited

Loading