Reduction to band miniapp prints negative flops when matrix size is equal to band size #965

Open
msimberg opened this issue Aug 31, 2023 · 0 comments
@msimberg
Collaborator

For example:

[0]
[0] 0.00350707s -204.11GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU
[1]
[1] 0.000240678s -2974.21GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU
[2]
[2] 1.6993e-05s -42124.9GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU
[3]
[3] 1.4077e-05s -50850.9GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU
[4]
[4] 1.7594e-05s -40685.9GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU

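This happens because the flop count is computed from an approximation that keeps only the high-order terms:

auto add_mul = 2. / 3. * n * n * n - n * n * b;

When the matrix size n equals the band size b, this evaluates to -n^3 / 3, which is negative. According to @rasolca: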

note that the correct calculation (i.e. not just the high order term) would return a NaN (as the flop are 0)
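
As a rough check of the explanation above, the following sketch evaluates the quoted approximation for the logged case. The helper names `approx_add_mul` and `gflops` are hypothetical and not taken from the miniapp. The factor of 2 in `gflops` is an assumption (one addition and one multiplication counted per unit for real double precision); it is chosen because it reproduces the first logged value of about -204.11 GFlop/s.

```cpp
#include <cassert>
#include <cmath>

// High-order approximation quoted in the issue: 2/3 n^3 - n^2 b.
// n: matrix size, b: band size (both passed as double to avoid integer overflow).
double approx_add_mul(double n, double b) {
  assert(n > 0. && b > 0.);
  return 2. / 3. * n * n * n - n * n * b;
}

// Hypothetical reporting helper, not the miniapp's actual code.
// ASSUMPTION: total flops = 2 * add_mul (one add plus one mul per unit).
double gflops(double add_mul, double seconds) {
  assert(seconds > 0.);
  return 2. * add_mul / seconds / 1e9;
}
```

With n = b = 1024 and the first logged time of 0.00350707 s, `gflops(approx_add_mul(1024., 1024.), 0.00350707)` comes out close to -204.11. More generally, the approximation is negative whenever b > 2n/3, so the case n == b is just one instance of the problem.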

@msimberg added the Type:Bug, TODO:Task, and Priority:Low labels on Aug 31, 2023
@msimberg changed the title from "Reduction to band miniapp prints negative flops when matrix size is equal to block size" to "Reduction to band miniapp prints negative flops when matrix size is equal to band size" on Aug 31, 2023