Result of floating point computations does not match between AVX2 and AVX512 #2782

Open
petrsm opened this issue Mar 4, 2024 · 7 comments
@petrsm

petrsm commented Mar 4, 2024

Hi guys!
We have complex floating-point ISPC code that produces slightly different results for the
avx2-i32x8 and avx512skx-x16 targets. Both targets use exactly the same command line, and
we are using "--math-lib=fast".

Is this behavior expected, or should the output be exactly the same?

Unfortunately, I have not isolated the issue yet; I am just checking what the expected behavior is.

Thanks!

@dbabokin
Collaborator

dbabokin commented Mar 4, 2024

It's expected, at least with --math-lib=fast. Fast implementations of the math library differ between ISAs. It's very difficult, if not impossible, to avoid that without sacrificing performance.

@emoon

emoon commented Mar 4, 2024

Yes, the compiler has much more leeway to reorder operations under fast-math, so the low-order bits of floating-point results usually won't match between targets.
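
To illustrate why evaluation order matters, here is a minimal Python sketch (not ISPC, and using doubles rather than 32-bit floats) of a reduction done with different lane counts. The `lane_sum` helper and the input values are made up for this example; it only assumes that each lane keeps its own running sum and that the lanes are then added left to right.

```python
def lane_sum(values, width):
    """Sum `values` the way a `width`-lane SIMD loop might: one running
    partial sum per lane, then a left-to-right reduction across lanes."""
    partial = [0.0] * width
    for i, v in enumerate(values):
        partial[i % width] += v  # element i lands in lane (i mod width)
    total = 0.0
    for p in partial:  # explicit loop; the built-in sum() may compensate for rounding
        total += p
    return total


# Hypothetical input chosen so that rounding depends on the grouping.
values = [1e16, 1.0, -1e16, 1.0]
narrow = lane_sum(values, 2)  # large terms cancel inside one lane first
wide = lane_sum(values, 4)    # a 1.0 is absorbed by 1e16 before the cancellation
print(narrow, wide)
```

The same numbers are added in both cases; only the grouping changes, and that is enough to change the rounded result.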

@petrsm
Author

petrsm commented Mar 4, 2024

We are not using --fast-math, only the 'fast' math function library selected with --math-lib.

However, even if I use --math-lib=default, I still get different results from the AVX512 path.

Also, I replaced rcp(x) with '1 / x' and rsqrt(x) with '1 / sqrt(x)', and the issue persists.

Is this still expected under the above conditions?

@dbabokin
Collaborator

dbabokin commented Mar 4, 2024

Try --math-lib=system to see if this has an effect. It should be slow, but it may help you isolate the problem. The "default" library depends quite heavily on the ISA and is not designed to be bit-reproducible across ISAs.

Removing rcp() and rsqrt() should make things more stable.
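
A rough Python sketch of why approximate reciprocals can diverge between targets. It assumes, purely for illustration, that one target starts from an estimate with about 12 good bits and the other from about 14, and that both refine it with one Newton-Raphson step. `approx_rcp` and `refined_rcp` are hypothetical helpers, not ISPC functions, and the bit counts are not taken from the ISPC sources.

```python
import math


def approx_rcp(x, bits):
    """Hypothetical stand-in for a hardware reciprocal estimate: the exact
    reciprocal of a positive x with its mantissa truncated to `bits` bits."""
    m, e = math.frexp(1.0 / x)  # 1/x == m * 2**e with 0.5 <= m < 1
    q = math.floor(m * 2**bits) / 2**bits
    return math.ldexp(q, e)


def refined_rcp(x, bits):
    """One Newton-Raphson step on top of the estimate: y1 = y0 * (2 - x * y0)."""
    y0 = approx_rcp(x, bits)
    return y0 * (2.0 - x * y0)


x = 3.0
coarse = refined_rcp(x, 12)  # assumed ~12-bit starting estimate
fine = refined_rcp(x, 14)    # assumed ~14-bit starting estimate
exact = 1.0 / x
print(coarse, fine, exact)
```

Both refined values are close to 1 / x, but they are not equal to each other or to the exact quotient, so any computation built on them inherits a target-dependent difference.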

If it's not the library, then it could be a slightly different sequence of FMA instructions due to optimization. You can try --opt=disable-fma to experiment.
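
For reference, a small Python sketch of the FMA effect. A fused multiply-add rounds once, while a separate multiply and add round twice. The fused path is emulated here with exact rational arithmetic, which is an assumption of this sketch and not how hardware does it; the function names and inputs are made up for the example.

```python
from fractions import Fraction


def mul_add_separate(a, b, c):
    """a * b + c with two roundings, as with separate mul and add instructions."""
    product = a * b     # rounded to the nearest double
    return product + c  # rounded again


def mul_add_fused(a, b, c):
    """a * b + c with a single rounding, emulating an FMA instruction."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # exact rational result
    return float(exact)                              # rounded once


# Inputs chosen so that the product is not exactly representable.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0
separate = mul_add_separate(a, b, c)
fused = mul_add_fused(a, b, c)
print(separate, fused)
```

If the optimizer contracts a multiply-add pair on one target but not on the other, the two results above are what each side would see for this input.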

@petrsm
Author

petrsm commented Mar 4, 2024

Great tips! Thanks!

@nurmukhametov
Collaborator

@petrsm, is your project open-source?

@petrsm
Author

petrsm commented Apr 5, 2024

Unfortunately it is not. In the end, by avoiding rsqrt() and rcp(), I managed to make the results match between AVX512 and AVX2, except for a few specific functions. I ran out of time to investigate, so for the functions that do not match, I always use AVX2.

@pbrubaker pbrubaker added the Bugs label Apr 25, 2024