Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use flatten_args for _impl_caches in templates #9457

Open
wants to merge 26 commits into
base: main
Choose a base branch
from

Conversation

dlee992
Copy link
Contributor

@dlee992 dlee992 commented Feb 22, 2024

Fixes #9440
Fixes #9462

@dlee992 dlee992 marked this pull request as draft February 22, 2024 02:56
@dlee992
Copy link
Contributor Author

dlee992 commented Feb 23, 2024

Numba compilation still has at least 2 issues:

  1. sometimes, if the return type of a jitted/overloaded function is Tuple, numba forcily doesn't cache this function into its cache dict.
  2. different callers call the same overloaded function with the same type signature, the overloaded function still compiles twice, since they have different compilation flags, e.g., no_cpython_wrapper=True or False, and fields in flags are in different order. This makes sense in some cases.

In this PR, I only solved one issue (different from above 2): if a overloaded function is compiled in caller's typeinfer pass, then I can make sure it won't recompile in caller's native lowering pass. (I hope so, but not pretty sure)

@dlee992
Copy link
Contributor Author

dlee992 commented Feb 27, 2024

An example to show why we need this PR:

from os import environ

environ["NUMBA_CHROME_TRACE"] = "3.json"
environ["NUMBA_DUMP_LLVM"] = "1"

from numba import njit
from numba.core import types
from numba.typed import Dict


@njit
def goo(k, j):
    d0 = Dict.empty(types.int64, types.int64)
    d = Dict.empty(types.int64, value_type=types.int64)
    d1 = Dict.empty(key_type=types.int64, value_type=types.int64)
    d[1] = 1
    d[2] = 2
    d[3] = 3
    _ = d.get(k)
    d.pop(k)
    d.pop(j, None)
    del d[3]

@njit
def joo(k):
    d = Dict.empty(types.int64, value_type=types.int64)
    d[1] = 1
    return d.pop(k, None)

print(goo(1, 2))
print(joo(0))

without this patch, we compile these modules:

---------LLVM DUMP <function descriptor 'impl_new_dict.<locals>.imp$3'>---------
-------LLVM DUMP <function descriptor 'typeddict_empty.<locals>.impl$2'>--------
-------LLVM DUMP <function descriptor 'typeddict_empty.<locals>.impl$4'>--------
-------LLVM DUMP <function descriptor 'typeddict_empty.<locals>.impl$5'>--------
----------LLVM DUMP <function descriptor 'ol_hasattr.<locals>.impl$9'>----------
--------LLVM DUMP <function descriptor 'ol_getattr_2.<locals>.impl$10'>---------
--------------LLVM DUMP <function descriptor 'process_return$13'>---------------
----------------LLVM DUMP <function descriptor '_long_impl$14'>-----------------
----------LLVM DUMP <function descriptor 'int_hash.<locals>.impl$12'>-----------
--------LLVM DUMP <function descriptor 'ol_defer_hash.<locals>.impl$11'>--------
--------LLVM DUMP <function descriptor 'hash_overload.<locals>.impl$7'>---------
---------LLVM DUMP <function descriptor 'impl_setitem.<locals>.impl$6'>---------
----------LLVM DUMP <function descriptor 'impl_get.<locals>.impl$16'>-----------
----------LLVM DUMP <function descriptor 'impl_pop.<locals>.impl$17'>-----------
----------LLVM DUMP <function descriptor 'impl_pop.<locals>.impl$18'>-----------
----------LLVM DUMP <function descriptor 'impl_pop.<locals>.impl$20'>-----------
----------LLVM DUMP <function descriptor 'impl_pop.<locals>.impl$21'>-----------
--------LLVM DUMP <function descriptor 'impl_delitem.<locals>.impl$19'>---------
-------LLVM DUMP <function descriptor 'typeddict_empty.<locals>.impl$22'>-------
----------LLVM DUMP <function descriptor 'impl_get.<locals>.impl$23'>-----------
----------LLVM DUMP <function descriptor 'impl_pop.<locals>.impl$24'>-----------
--------------------LLVM DUMP <function descriptor 'goo$1'>---------------------
--------------------LLVM DUMP <function descriptor 'joo$25'>--------------------

with this patch, we compile these modules:

---------LLVM DUMP <function descriptor 'impl_new_dict.<locals>.imp$3'>---------
-------LLVM DUMP <function descriptor 'typeddict_empty.<locals>.impl$2'>--------
----------LLVM DUMP <function descriptor 'ol_hasattr.<locals>.impl$7'>----------
---------LLVM DUMP <function descriptor 'ol_getattr_2.<locals>.impl$8'>---------
--------------LLVM DUMP <function descriptor 'process_return$11'>---------------
----------------LLVM DUMP <function descriptor '_long_impl$12'>-----------------
----------LLVM DUMP <function descriptor 'int_hash.<locals>.impl$10'>-----------
--------LLVM DUMP <function descriptor 'ol_defer_hash.<locals>.impl$9'>---------
--------LLVM DUMP <function descriptor 'hash_overload.<locals>.impl$5'>---------
---------LLVM DUMP <function descriptor 'impl_setitem.<locals>.impl$4'>---------
----------LLVM DUMP <function descriptor 'impl_get.<locals>.impl$14'>-----------
----------LLVM DUMP <function descriptor 'impl_pop.<locals>.impl$15'>-----------
----------LLVM DUMP <function descriptor 'impl_pop.<locals>.impl$16'>-----------
--------LLVM DUMP <function descriptor 'impl_delitem.<locals>.impl$17'>---------
--------------------LLVM DUMP <function descriptor 'goo$1'>---------------------
--------------------LLVM DUMP <function descriptor 'joo$18'>--------------------

The diff is without it, our main will compile typeddict_empty 4 times and impl_pop 5 times. with this patch, it compiles typeddict_empty 1 time and impl_pop twice (for different default types, Omitted(default=None) and types.none)
cc @guilhermeleobas

@gmarkall
Copy link
Member

/azp run

Copy link
Contributor

Azure Pipelines successfully started running 1 pipeline(s).

@@ -1952,7 +1952,7 @@ def getitem(a, a_i):
raise TypeError("np.random.choice() first argument should be "
"int or array, got %s" % (a,))

if size in (None, types.none):
if size in (None, types.none) or isinstance(size, types.Omitted):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a proper review, but this ought to be cgutils.is_nonelike(size)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also notice this API to check it, will refactor.

@dlee992
Copy link
Contributor Author

dlee992 commented Feb 29, 2024

I didn't change this file, but CI reports the flake8 errors in it. Emm...

numba/tests/test_unicode.py:2670:27: E226 missing whitespace around arithmetic operator

btw, the last real failure is about overriding @overload flags in a test, try to understand the test case first..

@dlee992
Copy link
Contributor Author

dlee992 commented Mar 4, 2024

@sklam I'm not sure if I choose the right way to fix them. Maybe the real fix should happen at the start of "funny" things, rather than at the end of them. I choose the latter way in this PR.

The real issue is the inconsistency between _impl_caches and the underlying compilation for the same function. The former has many issues, while the latter is always "perfect".

Copy link
Member

@sklam sklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "review" is really for developers. The PR points out several problems in the overload template code. Also, it would be better if we don't have to handle None/NoneType/Omitted as this PR is fixing.

Comment on lines 705 to 708
dict_flags = dict(flags.options) if flags is not None else {}
for key, value in self._jit_options.items():
if key in dict_flags:
dict_flags["key"] = value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dev note: check why flags are not correct; e.g. no_cpython_wrapper

Comment on lines 710 to 722
# FIXME: what's the real thing need to do is adding Ommitted to the args
# check the pyfunc signature, if exist some kwargs, e.g., default=None,
# check whether kws specify them, if not,
# please add Omitted(default=None) to args
ov_sig = inspect.signature(self._overload_func)
if len(args) + len(kws) < len(ov_sig.parameters):
for param in list(ov_sig.parameters.values())[len(args):]:
name = param.name
default = param.default
if default != inspect.Parameter.empty and name not in kws:
# add numba type of this param into kws
# it must be an Omitted type with a default value
kws[name] = types.Omitted(default)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dev note: we should normalize signatures

Comment on lines +847 to +851
if sig is not None:
flatten_args = sig.args

if cache_key is not None:
self._impl_cache[cache_key] = disp, flatten_args
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dev note: needs investigation. sometimes sig.args is different from the given args.

@sklam sklam added the 4 - Waiting on reviewer Waiting for reviewer to respond to author label Mar 12, 2024
@dlee992
Copy link
Contributor Author

dlee992 commented Mar 28, 2024

interesting failure:

Traceback (most recent call last):
  File "/home/vsts/work/1/s/numba/tests/test_compiler_flags.py", line 40, in test_fastmath_in_overload
    self.assertEqual(b, "Has fastmath")
AssertionError: 'No fastmath' != 'Has fastmath'
- No fastmath
? ^^
+ Has fastmath
? ^^^

It could inspire me how to fix flags more decently.

@dlee992
Copy link
Contributor Author

dlee992 commented Mar 28, 2024

added a naive release doc and fixed the failed test case.

TODO: add some new tests to check if this PR can deduplicate the expected unnecessary compilations.

@dlee992 dlee992 marked this pull request as ready for review March 28, 2024 21:18
@dlee992
Copy link
Contributor Author

dlee992 commented Mar 28, 2024

this failure (numba.numba (Linux py310_np123)) is a CI infra issue, not code issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 - In Progress 4 - Waiting on reviewer Waiting for reviewer to respond to author
Projects
None yet
4 participants