pandas testsuite with numpy 2.0.0rc1 fails on numexpr #483

Open
bnavigator opened this issue Apr 19, 2024 · 5 comments

@bnavigator

I'm currently testing numpy 2 on the openSUSE python ecosystem. I noticed the pandas test suite failing when numpy 2.0.0rc1 is installed instead of 1.26.4:

[  681s] =================================== FAILURES ===================================
[  681s] _ TestTypeCasting.test_binop_typecasting[numexpr-python-left_right0-float64-/] _
[  681s] [gw1] linux -- Python 3.11.8 /usr/bin/python3.11
[  681s] 
[  681s] self = <pandas.tests.computation.test_eval.TestTypeCasting object at 0x7ff16c7666d0>
[  681s] engine = 'numexpr', parser = 'python', op = '/', dt = <class 'numpy.float64'>
[  681s] left_right = ('df', '3')
[  681s] 
[  681s]     @pytest.mark.parametrize("op", ["+", "-", "*", "**", "/"])
[  681s]     # maybe someday... numexpr has too many upcasting rules now
[  681s]     # chain(*(np.core.sctypes[x] for x in ['uint', 'int', 'float']))
[  681s]     @pytest.mark.parametrize("dt", [np.float32, np.float64])
[  681s]     @pytest.mark.parametrize("left_right", [("df", "3"), ("3", "df")])
[  681s]     def test_binop_typecasting(self, engine, parser, op, dt, left_right):
[  681s]         df = DataFrame(np.random.default_rng(2).standard_normal((5, 3)), dtype=dt)
[  681s]         left, right = left_right
[  681s]         s = f"{left} {op} {right}"
[  681s] >       res = pd.eval(s, engine=engine, parser=parser)
[  681s] 
[  681s] pandas/tests/computation/test_eval.py:756: 
[  681s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[  681s] pandas/core/computation/eval.py:357: in eval
[  681s]     ret = eng_inst.evaluate()
[  681s] pandas/core/computation/engines.py:81: in evaluate
[  681s]     res = self._evaluate()
[  681s] pandas/core/computation/engines.py:121: in _evaluate
[  681s]     return ne.evaluate(s, local_dict=scope)
[  681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:977: in evaluate
[  681s]     raise e
[  681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:874: in validate
[  681s]     _names_cache[expr_key] = getExprNames(ex, context, sanitize=sanitize)
[  681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:723: in getExprNames
[  681s]     ex = stringToExpression(text, {}, context, sanitize)
[  681s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[  681s] 
[  681s] s = '(df) / (np.float64(3.0))', types = {}
[  681s] context = {'optimization': 'aggressive', 'truediv': False}, sanitize = True
[  681s] 
[  681s]     def stringToExpression(s, types, context, sanitize: bool=True):
[  681s]         """Given a string, convert it to a tree of ExpressionNode's.
[  681s]         """
[  681s]         # sanitize the string for obvious attack vectors that NumExpr cannot
[  681s]         # parse into its homebrew AST. This is to protect the call to `eval` below.
[  681s]         # We forbid `;`, `:`. `[` and `__`, and attribute access via '.'.
[  681s]         # We cannot ban `.real` or `.imag` however...
[  681s]         # We also cannot ban `.\d*j`, where `\d*` is some digits (or none), e.g. 1.5j, 1.j
[  681s]         if sanitize:
[  681s]             no_whitespace = re.sub(r'\s+', '', s)
[  681s]             skip_quotes = re.sub(r'(\'[^\']*\')', '', no_whitespace)
[  681s]             if _blacklist_re.search(skip_quotes) is not None:
[  681s] >               raise ValueError(f'Expression {s} has forbidden control characters.')
[  681s] E               ValueError: Expression (df) / (np.float64(3.0)) has forbidden control characters.
[  681s] 
[  681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:283: ValueError
...
[  684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-python-left_right0-float64-/]
[  684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-python-left_right1-float64-/]
[  684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-pandas-left_right0-float64-/]
[  684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-pandas-left_right1-float64-/]
[  684s] FAILED pandas/tests/computation/test_eval.py::TestOperations::test_simple_arith_ops[numexpr-python]
[  684s] FAILED pandas/tests/computation/test_eval.py::TestOperations::test_simple_arith_ops[numexpr-pandas]
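
To make the failure above concrete, here is a minimal sketch of the kind of check that rejects the expression. It follows the two `re.sub` steps visible in the traceback, but `FORBIDDEN` is a simplified stand-in written for illustration, based only on the comments in `stringToExpression`. It is not numexpr's actual `_blacklist_re`, and `is_rejected` is a hypothetical helper name.

```python
import re

# Simplified stand-in for numexpr's blacklist (an assumption, not the real
# pattern): forbid `;`, `:`, `[`, `__`, and attribute access via `.`,
# while still allowing `.real`, `.imag`, and complex literals like `1.j`.
FORBIDDEN = re.compile(r"[;:\[]|__|\.(?!real\b|imag\b|\d*j\b)[A-Za-z_]")


def is_rejected(expr: str) -> bool:
    """Return True if the simplified sanitizer would refuse `expr`."""
    # Same preprocessing as in the traceback: drop whitespace, then drop
    # single-quoted string literals before searching.
    no_whitespace = re.sub(r"\s+", "", expr)
    skip_quotes = re.sub(r"('[^']*')", "", no_whitespace)
    return FORBIDDEN.search(skip_quotes) is not None


print(is_rejected("(df) / (np.float64(3.0))"))  # True: `.float64` looks like attribute access
print(is_rejected("(df) / (3.0)"))              # False: `.0` is part of a float literal
print(is_rejected("(df) / b"))                  # False: the constant is referenced by name
```

Under these assumptions, only the string that embeds `np.float64(...)` is refused; a plain float literal or a named operand passes.
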
@FrancescAlted
Contributor

This is a consequence of the sanitizers that numexpr implemented a few months ago. In general, calling arbitrary functions inside numexpr expressions is not considered a good idea, so we encourage rewriting that test as:

In [11]: ne.evaluate('(df) / b', {'b': np.float64(3.0)})
Out[11]: array([0.33333333, 0.66666667, 1.        ])
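
The same idea can be written out as a self-contained script. This is a sketch: both operands are passed by name through `local_dict`, so the expression string contains only identifiers and operators. The array `a` and its values are assumptions chosen to match the output shown above; the original session does not show how `df` was defined.

```python
import numexpr as ne
import numpy as np

# Assumed input, chosen so the result matches the output shown above.
a = np.array([1.0, 2.0, 3.0])
b = np.float64(3.0)

# The expression names its operands; their values are supplied separately,
# so no NumPy repr ever ends up inside the string being parsed.
result = ne.evaluate("a / b", local_dict={"a": a, "b": b})
print(result)
```

Because the constant travels in the dictionary rather than in the string, the sanitizer never sees a `.` followed by a name.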

github-actions bot commented

Message to comment on stale issues. If none provided, will not mark issues stale

@github-actions github-actions bot added the Stale label Jun 30, 2024
@bnavigator
Author

I am not sure why the pandas bug has still not been picked up. NumPy 2 has been released for a few weeks already, and the regular pandas CI should have encountered the failure by now.

@github-actions github-actions bot removed the Stale label Jul 7, 2024

github-actions bot commented Sep 5, 2024

Message to comment on stale issues. If none provided, will not mark issues stale

@github-actions github-actions bot added the Stale label Sep 5, 2024
@bnavigator
Author

This was fixed in pandas main and is being backported in pandas-dev/pandas#59513
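
For readers wondering how a NumPy upgrade alone can change the expression string: NumPy 2 changed the `repr` of scalars, so `repr(np.float64(3.0))` is `'np.float64(3.0)'` where NumPy 1.x gave `'3.0'`. The sketch below shows the effect on a string built from a scalar's `repr`, and one way to get a version-independent string by converting to a built-in `float` first. This is only an illustration: it is an assumption that the failing string was produced via `repr`, and the sketch does not describe the actual change made in pandas.

```python
import numpy as np

value = np.float64(3.0)

# Depends on the NumPy version: '(df) / (np.float64(3.0))' on NumPy 2,
# '(df) / (3.0)' on NumPy 1.x.
expr_from_repr = f"(df) / ({value!r})"

# Converting to a built-in float first gives the same string on both.
expr_from_float = f"(df) / ({float(value)!r})"

print(expr_from_repr)
print(expr_from_float)
```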

@github-actions github-actions bot removed the Stale label Sep 6, 2024