[BugFix][TVMScript] Parser crash #13630

Merged
merged 3 commits into from
Dec 18, 2022

Conversation

lightzhan-intellif
Contributor

This PR fixes a parser crash that occurs when the old value of a variable is an array but the new value is not. For example:

import numpy as np
from tvm.script import tir as T


def func_wrapper(shape, dtype):
    @T.prim_func
    def test_case():
        a = T.alloc_buffer(shape, dtype=dtype)

    return test_case


if __name__ == "__main__":
    # Module-level 'a' is a NumPy array; inside test_case, 'a' is a Buffer.
    a = np.zeros((10, 10), dtype="int8")
    print(func_wrapper((256, 256), dtype="int8").script())

In the code above, there are two assignments to the variable 'a'. In the global scope its value is a NumPy array, but inside the prim function it is a Buffer. The parser keeps a table named 'name2value' to track the values of variables such as 'a'.
When the parser updates a variable's value, it compares the new value with the old one, and this is where the problem arises. Comparing an array with another value using '==' produces an array, which cannot be used directly as the condition of an if statement. As a result, the code above emits an error:

error: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
 --> /workspace/code_newest/tvm/private_test/test_meta_programming.py:16:9
    |  
 16 |          a = T.alloc_buffer(shape, dtype=dtype)
    |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This PR fixes this by changing "==" to "is".
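
The failure can be reproduced without TVM. The sketch below is only an illustration: `VarTableSketch` is a hypothetical stand-in for the parser's table (only the `name2value` name comes from this PR), and a plain integer stands in for the Buffer.

```python
from collections import defaultdict

import numpy as np


class VarTableSketch:
    """Hypothetical, simplified stand-in for the parser's variable table."""

    def __init__(self):
        # Maps a variable name to the stack of values assigned to it.
        self.name2value = defaultdict(list)

    def add(self, var, value):
        # Assumed shape of the comparison before the fix: '==' on a NumPy
        # array yields an array, so evaluating it in 'if' raises ValueError.
        if self.name2value[var] and self.name2value[var][-1] == value:
            return
        self.name2value[var].append(value)


table = VarTableSketch()
table.add("a", np.zeros((10, 10), dtype="int8"))  # outer-scope value

try:
    table.add("a", 1)  # integer used here in place of the Buffer
    crashed = False
    message = ""
except ValueError as err:
    crashed = True
    message = str(err)

print(crashed, message)
```

Replacing `==` with `is` in `add` avoids the exception, because an identity check always returns a plain `bool`.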

@tvm-bot
Collaborator

tvm-bot commented Dec 16, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

  • No users to tag found in teams: bugfix, tvmscript See #10317 for details

Generated by tvm-bot

if __name__ == "__main__":
    a = numpy.zeros((10, 10), dtype="int8")
Member

Can you move this line to line 49 (under the test_different_dtype_assignment_to_var)?

Contributor Author

Unfortunately, that does not work in this case. If we move it there, the prim function cannot capture the variable 'a', because 'a' is then not a nonlocal variable of the function test_case.

Comment on lines 153 to 154
if self.name2value[var] and self.name2value[var][-1] is value:
    return
Member

Thanks for pointing out the issue! I believe neither approach is fully accurate, because self.name2value[var] could be a Python integer or a similar value, which cannot be reliably compared using is. We might want to dispatch the comparison according to the types involved.

Contributor Author

lightzhan-intellif Dec 16, 2022

Thanks for your suggestion. I ran some experiments in the Python interpreter to check your concern. Let's have a look:

# python3
Python 3.8.13 (default, Apr 19 2022, 00:53:22) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 1 is 1
<stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
True
>>> a = 1
>>> a is 1
<stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
True
>>> a = 1
>>> b = 1
>>> a is b
True
>>> a = [1, 2, 3]
>>> b = 1
>>> a[0] is b
True
>>> b = [1, 2, 3]
>>> a [0] is b[1]
False

The output above shows that Python warns only when "is" is applied to a literal directly. Your concern is about a variable, list, or dict that holds such a value. It looks like Python distinguishes a literal from a variable holding a literal value, and our case is the latter, so there may be no problem with using "is" here.

There may be other scenarios I didn't cover; feel free to point them out.

Contributor

I believe the is operator checks reference equality rather than value equality. For integers, it will just check equality, so @lightzhan-intellif is correct. Whether it's the preferred style is another question.

Member

I do not have a particular opinion on which style we should go for, but I want to point out the implications of the switch:

  • is checks referential/pointer equality (identity, in Python's terms): it returns True only when id(lhs) == id(rhs). The result can depend on implementation details of the interpreter, for example:
>>> a = 257
>>> b = 257
>>> a is b
False
>>> a = 256
>>> b = 256
>>> a is b
True
  • == checks equality, which can be overloaded. For TVM objects, it uses TVM's address comparison .same_as() rather than Python's builtin id(), which the is operator uses. However

Switching from == to is bypasses TVM's .same_as() method, and at the moment I am not certain that this is suitable for broad use cases.
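
To make the difference concrete, here is a small sketch that does not use TVM. `Handle` is a hypothetical class whose `__eq__` delegates to a `same_as` method comparing a stored address; it only mimics the behavior described above and is not TVM's actual object model.

```python
class Handle:
    """Hypothetical wrapper: distinct Python objects may share one address."""

    def __init__(self, address):
        self.address = address

    def same_as(self, other):
        # Address-based comparison, analogous in spirit to .same_as().
        return isinstance(other, Handle) and self.address == other.address

    def __eq__(self, other):
        # '==' is overloaded to use the address comparison.
        return self.same_as(other)

    def __hash__(self):
        return hash(self.address)


lhs = Handle(0x1000)
rhs = Handle(0x1000)  # a second Python wrapper around the same address

eq_result = lhs == rhs  # goes through __eq__, hence same_as()
is_result = lhs is rhs  # compares Python object identity only

print(eq_result, is_result)
```

Here `==` reports the two wrappers as equal while `is` does not, so switching operators changes which assignments count as "unchanged".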

Therefore, how about we do the following: if lhs and rhs are numpy arrays, then we use numpy-specific behavior (e.g. elementwise equality), but otherwise we still use ==.
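
A minimal sketch of that dispatch, assuming a hypothetical helper named `values_match` (this is an illustration of the idea, not the code that was merged):

```python
import numpy as np


def values_match(old, new):
    """Return True if 'new' should be treated as the same value as 'old'."""
    old_is_array = isinstance(old, np.ndarray)
    new_is_array = isinstance(new, np.ndarray)
    if old_is_array or new_is_array:
        # An array never matches a non-array; this avoids the ambiguous
        # elementwise '==' result altogether.
        if not (old_is_array and new_is_array):
            return False
        # NumPy-specific rule: same shape and all elements equal.
        return bool(np.array_equal(old, new))
    # Everything else keeps '==', so overloaded equality still applies.
    return bool(old == new)


print(values_match(np.zeros((10, 10)), np.zeros((10, 10))))
print(values_match(np.zeros((10, 10)), object()))
print(values_match(257, 257))
```

Integers are compared by value here, so the result does not depend on whether the interpreter happens to cache a given integer object.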

Contributor Author

Yea, I have updated the code.

Member

junrushao left a comment

LGTM! Please fix the lint and I'm happy to get it in

@lightzhan-intellif
Contributor Author

> LGTM! Please fix the lint and I'm happy to get it in

Done

@junrushao junrushao merged commit 4096548 into apache:main Dec 18, 2022
@junrushao
Member

Thanks @lightzhan-intellif @slyubomirsky @Hzfengsy for the discussion!

fzi-peccia pushed a commit to fzi-peccia/tvm that referenced this pull request Mar 27, 2023
mikeseven pushed a commit to mikeseven/tvm that referenced this pull request Sep 27, 2023
5 participants