give primitive types a faster key&value compare, copy, and zero methods in TypedDict #9520

dlee992 · 2024-04-01T22:01:23Z

fixes #9519

guilhermeleobas

Thanks for your contribution, @dlee992. Ping me once the PR is ready and I can review it.

numba/core/datamodel/models.py

numba/typed/typedobjectutils.py

dlee992 · 2024-04-02T14:50:10Z

Thanks @guilhermeleobas !

I'm not familiar with IR Builder stuff, your comments are valuable for me! Will address them one-by-one.

… default key_equal back

dlee992 · 2024-04-02T19:28:34Z

Off topic: I often saw

CondaHTTPError: HTTP 524 CONNECTION FAILED for url <[https://conda.anaconda.org/numba/label/dev/linux-64/llvmlite-0.43.0dev0-py310he1b5a44_12.tar.bz2>](https://conda.anaconda.org/numba/label/dev/linux-64/llvmlite-0.43.0dev0-py310he1b5a44_12.tar.bz2%3E)

this network error in CI recently, but normal users don't have permissions to rerun this single job.

Perhaps someone can add retries=2 in CI config if detecting this kind of HTTP Error?

dlee992 · 2024-04-02T20:02:44Z

I think this is ready for review.
All failing jobs are Before install or Build error related to HTTP errors and llvmlite.
All other jobs are passed.

Will add a release note soon and post some perf test and result.

dlee992 · 2024-04-02T22:55:18Z

Here is a mini-benchmark:

import os
import random
import time

from numba import njit, typed, types

N = 100000

@njit
def koo():
    keys = [i for i in range(N)]
    values = [random.randint(1, 100) for _ in range(N)]
    d = dict(zip(keys, values))

    for _ in range(N):
        d[random.randint(N+1, N+N)] = random.randint(1, 100)

    tot = 0
    for _ in range(N):
        tot += d[random.randint(0, 100)]

    # for i in range(N):
    #     del d[i]

    return tot

koo()
start = time.time()
for _ in range(100):
    koo()
print(time.time() - start)

On my machine, the ratio is 0.88

without this PR, its time is 0.7227180004119873
with this PR, its time is 0.6371762752532959

But I found a more profound inherent limitation in current numba impl: in theory, I want to give a typed-based operation dispatching for primitive types and container types (which even include 2 kinds, with meminfo, and without meminfo).

In the first thought, I want to implement this template in C, but I feel it's not easy. So in this PR, I tried to implement this template for primitive types with llvmlite IR builder. However, the actual call chain is

LLVM IR function (e.g., `koo`) 
    --> Numba C Dict Implementation (e.g., `dict_lookup`)
        --> LLVM IR function (generated by this PR, e.g., `key_equal`)

The two call-frames make the execution slower, given key_equal with primitive types is pretty fast.
In fact, if I put a specialized key_equal in C Dict implementation, it can be faster.

Any idea about how to implement this template stuff for primitive types in C side and can easily interact with Numba-generated LLVM modules/functions?

guilhermeleobas · 2024-04-03T00:26:34Z

Hi @dlee992, I can review this by the end of the week. Don't worry about the CI failures that are not directly related to your PR.

dlee992 · 2024-04-03T14:58:21Z

@guilhermeleobas, cool! All checks have passed this time.

A concern: as I mentioned in a previous comment a more profound inherent limitation, do you think PIXIE can overcome this limitation, given it would emit&inject LLVM IR of C/C++ code into numba generated LLVM module, which can bring more optimisation chance?

Updates after checking the demo notebook in PIXIE repo: I feel PIXIE also can't solve this issue very well, but not pretty sure.

guilhermeleobas

Hi @dlee992, your code seems to be ok, and it gives speedup in the example you provided. I have some comments and a couple of questions. Ping me once you're done and I can review it again.

numba/core/datamodel/models.py

numba/typed/dictobject.py

numba/typed/typedobjectutils.py

dlee992 · 2024-04-05T20:09:27Z

not sure what changes, but the result of the mini-benchmark becomes 0.7 vs 1.4, speed up is 2x.

Can we put this into 0.60rc?

gentle ping @guilhermeleobas , can review this again.

BTW, I noticed related key_equal, key_copy and val_copy will appear in the LLVM IR module many times. Not sure if it's expected.

dlee992 · 2024-04-22T16:38:58Z

Looks like we can't put this into 0.60. Then perhaps put it into 0.61rc?

guilhermeleobas

Hi @dlee992, sorry for the delay as I was out for some days. The changes you proposed looks good to me, but want to make sure we don't introduce any regression fixed in #6402. Could you undo the changes related to container_element_type?

numba/typed/typedobjectutils.py

dlee992 · 2024-05-03T15:24:12Z

I believe this PR has been reviewed from you in a very quick speed! Compared to my other PRs. : )

guilhermeleobas · 2024-05-03T21:12:30Z

I believe this PR has been reviewed from you in a very quick speed! Compared to my other PRs. : )

If you need review in any other PR, feel free to ping me.

guilhermeleobas · 2024-05-03T21:47:23Z

Since this touches part of the C code, it would be good if anyone else could review. Meanwhile, do you have the number of how fast these functions are? If so, could you also share the code you're using to benchmark these functions?

dlee992 · 2024-05-03T21:50:18Z

Here is a mini-benchmark (I guess you forgot this?) #9520 (comment)

guilhermeleobas · 2024-05-03T21:52:06Z

Here is a mini-benchmark (I guess you forgot this?) #9520 (comment)

Oops, my bad. I'll try locally.

edit: Here are the numbers I get locally with your benchmark:

main: ~0.75
your branch: ~0.61

dlee992 added 5 commits April 1, 2024 16:59

first try to fix 9519

4d0fa0a

fix AttributeError: 'DoubleType' object has no attribute 'width'

790da56

check width in fe_type and be_type, fix the typo

643aee9

use get_constant_generic to obtain a zero

fe7ef3c

use meaningful argument name

7f30e44

guilhermeleobas reviewed Apr 2, 2024

View reviewed changes

guilhermeleobas added the 2 - In Progress label Apr 2, 2024

dlee992 added 4 commits April 2, 2024 13:36

typo fix and use get_constant_generic

54521ae

only generate specialized functions for PrimitiveModel, and bring the…

9880a52

… default key_equal back

some refactor

0fe0d7c

typo fix one more

48ba7e5

dlee992 added 2 commits April 2, 2024 14:40

use f-format

e926696

put them into one line

bb1c21f

dlee992 marked this pull request as ready for review April 2, 2024 20:00

dlee992 marked this pull request as draft April 2, 2024 23:13

add a note

7560394

dlee992 marked this pull request as ready for review April 3, 2024 14:58

guilhermeleobas added 3 - Ready for Review and removed 2 - In Progress labels Apr 4, 2024

guilhermeleobas requested changes Apr 4, 2024

View reviewed changes

guilhermeleobas added 4 - Waiting on author Waiting for author to respond to review Effort - medium Medium size effort needed and removed 3 - Ready for Review labels Apr 5, 2024

address comments

93a66b8

gmarkall added 4 - Waiting on reviewer Waiting for reviewer to respond to author and removed 4 - Waiting on author Waiting for author to respond to review labels Apr 8, 2024

guilhermeleobas added this to the 0.61.0-rc1 milestone Apr 29, 2024

guilhermeleobas requested changes Apr 29, 2024

View reviewed changes

numba/typed/typedobjectutils.py Outdated Show resolved Hide resolved

address comment: bring container_element_type back

5d75516

guilhermeleobas added 4 - Waiting on second reviewer Patch needs a second reviewer. and removed 4 - Waiting on reviewer Waiting for reviewer to respond to author labels May 3, 2024

guilhermeleobas added the 4 - Waiting on author Waiting for author to respond to review label May 3, 2024

guilhermeleobas approved these changes May 3, 2024

View reviewed changes

guilhermeleobas removed the 4 - Waiting on author Waiting for author to respond to review label May 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

give primitive types a faster key&value compare, copy, and zero methods in TypedDict #9520

give primitive types a faster key&value compare, copy, and zero methods in TypedDict #9520

dlee992 commented Apr 1, 2024

guilhermeleobas left a comment

dlee992 commented Apr 2, 2024

dlee992 commented Apr 2, 2024

dlee992 commented Apr 2, 2024

dlee992 commented Apr 2, 2024 •

edited

guilhermeleobas commented Apr 3, 2024

dlee992 commented Apr 3, 2024 •

edited

guilhermeleobas left a comment

dlee992 commented Apr 5, 2024 •

edited

dlee992 commented Apr 22, 2024

guilhermeleobas left a comment

dlee992 commented May 3, 2024

guilhermeleobas commented May 3, 2024

guilhermeleobas commented May 3, 2024

dlee992 commented May 3, 2024

guilhermeleobas commented May 3, 2024 •

edited

give primitive types a faster key&value compare, copy, and zero methods in TypedDict #9520

Are you sure you want to change the base?

give primitive types a faster key&value compare, copy, and zero methods in TypedDict #9520

Conversation

dlee992 commented Apr 1, 2024

guilhermeleobas left a comment

Choose a reason for hiding this comment

dlee992 commented Apr 2, 2024

dlee992 commented Apr 2, 2024

dlee992 commented Apr 2, 2024

dlee992 commented Apr 2, 2024 • edited

guilhermeleobas commented Apr 3, 2024

dlee992 commented Apr 3, 2024 • edited

guilhermeleobas left a comment

Choose a reason for hiding this comment

dlee992 commented Apr 5, 2024 • edited

dlee992 commented Apr 22, 2024

guilhermeleobas left a comment

Choose a reason for hiding this comment

dlee992 commented May 3, 2024

guilhermeleobas commented May 3, 2024

guilhermeleobas commented May 3, 2024

dlee992 commented May 3, 2024

guilhermeleobas commented May 3, 2024 • edited

dlee992 commented Apr 2, 2024 •

edited

dlee992 commented Apr 3, 2024 •

edited

dlee992 commented Apr 5, 2024 •

edited

guilhermeleobas commented May 3, 2024 •

edited