pandas testsuite with numpy 2.0.0rc1 fails on numexpr #483

Open
bnavigator opened this issue Apr 19, 2024 · 1 comment

@bnavigator

I'm currently testing numpy 2 on the openSUSE python ecosystem. I notice the pandas test suite failing when numpy 2.0.0rc1 is installed instead of 1.26.4:

[  681s] =================================== FAILURES ===================================
[  681s] _ TestTypeCasting.test_binop_typecasting[numexpr-python-left_right0-float64-/] _
[  681s] [gw1] linux -- Python 3.11.8 /usr/bin/python3.11
[  681s] 
[  681s] self = <pandas.tests.computation.test_eval.TestTypeCasting object at 0x7ff16c7666d0>
[  681s] engine = 'numexpr', parser = 'python', op = '/', dt = <class 'numpy.float64'>
[  681s] left_right = ('df', '3')
[  681s] 
[  681s]     @pytest.mark.parametrize("op", ["+", "-", "*", "**", "/"])
[  681s]     # maybe someday... numexpr has too many upcasting rules now
[  681s]     # chain(*(np.core.sctypes[x] for x in ['uint', 'int', 'float']))
[  681s]     @pytest.mark.parametrize("dt", [np.float32, np.float64])
[  681s]     @pytest.mark.parametrize("left_right", [("df", "3"), ("3", "df")])
[  681s]     def test_binop_typecasting(self, engine, parser, op, dt, left_right):
[  681s]         df = DataFrame(np.random.default_rng(2).standard_normal((5, 3)), dtype=dt)
[  681s]         left, right = left_right
[  681s]         s = f"{left} {op} {right}"
[  681s] >       res = pd.eval(s, engine=engine, parser=parser)
[  681s] 
[  681s] pandas/tests/computation/test_eval.py:756: 
[  681s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[  681s] pandas/core/computation/eval.py:357: in eval
[  681s]     ret = eng_inst.evaluate()
[  681s] pandas/core/computation/engines.py:81: in evaluate
[  681s]     res = self._evaluate()
[  681s] pandas/core/computation/engines.py:121: in _evaluate
[  681s]     return ne.evaluate(s, local_dict=scope)
[  681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:977: in evaluate
[  681s]     raise e
[  681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:874: in validate
[  681s]     _names_cache[expr_key] = getExprNames(ex, context, sanitize=sanitize)
[  681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:723: in getExprNames
[  681s]     ex = stringToExpression(text, {}, context, sanitize)
[  681s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[  681s] 
[  681s] s = '(df) / (np.float64(3.0))', types = {}
[  681s] context = {'optimization': 'aggressive', 'truediv': False}, sanitize = True
[  681s] 
[  681s]     def stringToExpression(s, types, context, sanitize: bool=True):
[  681s]         """Given a string, convert it to a tree of ExpressionNode's.
[  681s]         """
[  681s]         # sanitize the string for obvious attack vectors that NumExpr cannot
[  681s]         # parse into its homebrew AST. This is to protect the call to `eval` below.
[  681s]         # We forbid `;`, `:`. `[` and `__`, and attribute access via '.'.
[  681s]         # We cannot ban `.real` or `.imag` however...
[  681s]         # We also cannot ban `.\d*j`, where `\d*` is some digits (or none), e.g. 1.5j, 1.j
[  681s]         if sanitize:
[  681s]             no_whitespace = re.sub(r'\s+', '', s)
[  681s]             skip_quotes = re.sub(r'(\'[^\']*\')', '', no_whitespace)
[  681s]             if _blacklist_re.search(skip_quotes) is not None:
[  681s] >               raise ValueError(f'Expression {s} has forbidden control characters.')
[  681s] E               ValueError: Expression (df) / (np.float64(3.0)) has forbidden control characters.
[  681s] 
[  681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:283: ValueError
...
[  684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-python-left_right0-float64-/]
[  684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-python-left_right1-float64-/]
[  684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-pandas-left_right0-float64-/]
[  684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-pandas-left_right1-float64-/]
[  684s] FAILED pandas/tests/computation/test_eval.py::TestOperations::test_simple_arith_ops[numexpr-python]
[  684s] FAILED pandas/tests/computation/test_eval.py::TestOperations::test_simple_arith_ops[numexpr-pandas]
@FrancescAlted
Contributor

This is a consequence of the sanitizers that numexpr implemented a few months ago. In general, calling arbitrary functions inside numexpr expressions is not considered a good idea, so we encourage rewriting that test as:

In [11]: ne.evaluate('(df) / b', {'b': np.float64(3.0)})
Out[11]: array([0.33333333, 0.66666667, 1.        ])
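
To illustrate why the string in the traceback is rejected while the rewritten form passes, here is a minimal sketch of a sanitizer check. The pattern `FORBIDDEN` and the helper `is_rejected` are hypothetical stand-ins written for this example, not numexpr's actual `_blacklist_re`. They follow only the rules named in the code comment in the traceback (`;`, `:`, `[`, `__`, and attribute access via `.`) and omit the real exceptions for `.real`, `.imag`, and complex literals. The embedded `np.float64(3.0)` matches the scalar repr that NumPy 2 prints, which is presumably how it ended up in the expression string.

```python
import re

# Hypothetical, simplified stand-in for numexpr's blacklist; NOT the library's
# actual pattern. Flags ';', ':', '[', '__', and a '.' followed by a letter or
# underscore (i.e. attribute access). A '.' followed by a digit is allowed.
FORBIDDEN = re.compile(r"[;:\[]|__|\.[A-Za-z_]")


def is_rejected(expr: str) -> bool:
    """Return True if the expression would trip this simplified check."""
    # Same two preprocessing steps as shown in the traceback.
    no_whitespace = re.sub(r"\s+", "", expr)
    skip_quotes = re.sub(r"('[^']*')", "", no_whitespace)
    return FORBIDDEN.search(skip_quotes) is not None


embedded = "(df) / (np.float64(3.0))"  # string from the failing test
named = "(df) / b"                     # value supplied separately as 'b'
plain = "(df) / (3.0)"                 # plain float literal

for expr in (embedded, named, plain):
    print(f"{expr!r}: rejected={is_rejected(expr)}")
```

In this sketch only the first expression is flagged, because `np.float64` contains a `.` followed by a letter. Passing the scalar as a named variable keeps the expression string free of attribute access regardless of how the scalar is printed.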
