Result of floating point computations does not match between AVX2 and AVX512 #2782

petrsm · 2024-03-04T15:43:22Z

Hi guys!
We have complex floating-point ISPC code that produces slightly different results for the
avx2-i32x8 and avx512skx-x16 targets. Both targets are built with exactly the same command line, and
we are using "--math-lib=fast".

Is this behavior expected, or should the output be exactly the same?

Unfortunately, I have not isolated the issue yet; I am just checking what the expected behavior is.

Thanks!

dbabokin · 2024-03-04T17:18:59Z

It's expected, at least with --math-lib=fast. The fast implementations of the math library differ between ISAs. It's very difficult, if not impossible, to avoid that without sacrificing performance.

emoon · 2024-03-04T17:33:50Z

Yes, the compiler has much more leeway to reorder operations when doing fast-math, which means that the low-order bits of floating-point results usually won't match between targets.
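
To illustrate why reordering matters: floating-point addition is not associative, so grouping the same values differently can change the rounded result. The sketch below is a toy model, not ISPC's actual code generation. The helper `lane_sum` and the input data are hypothetical; it accumulates per-lane partial sums for a given gang width and then adds the lanes in order.

```python
def lane_sum(data, width):
    """Sum `data` using `width` per-lane accumulators, then add the lanes in order.

    Hypothetical toy model of a vectorized reduction; real code may combine
    lanes in a different (e.g. tree) order.
    """
    lanes = [0.0] * width
    for i, value in enumerate(data):
        lanes[i % width] += value
    total = 0.0
    for partial in lanes:  # explicit loop: plain left-to-right addition
        total += partial
    return total

# One large value followed by fifteen small ones (illustrative data only).
data = [2.0**53] + [1.0] * 15

sum_x8 = lane_sum(data, 8)    # small values pair up to 2.0 before meeting 2**53
sum_x16 = lane_sum(data, 16)  # each 1.0 is added to 2**53 on its own and is lost

print(sum_x8 - 2.0**53, sum_x16 - 2.0**53)
```

With 8 lanes the small values are combined into 2.0 before they meet the large value, so they survive; with 16 lanes each 1.0 is rounded away individually. Neither result equals the exact sum, and they differ from each other only because of the grouping.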

petrsm · 2024-03-04T19:32:21Z

We are not using the fast-math optimization, only the 'fast' library of math functions (--math-lib=fast).

However, even if I use --math-lib=default, I still get different results from the AVX512 path.

Also, I replaced rcp(x) with '1 / x' and rsqrt(x) with '1 / sqrt(x)', and the issue persists.

Is this still expected under the above conditions?

dbabokin · 2024-03-04T19:54:40Z

Try --math-lib=system to see if this has an effect. It will be slow, but it may help you isolate the problem. The "default" library depends quite heavily on the ISA and is not designed to be bit-reproducible across ISAs.

Removing rcp() and rsqrt() should make things more stable.
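
As a rough illustration of why approximate reciprocals are target-sensitive: the hardware estimate instructions differ in precision between ISAs (the SSE/AVX `rcpps` estimate is accurate to roughly 12 bits, the AVX-512 `vrcp14ps` estimate to roughly 14 bits). The sketch below is a toy model only. It assumes an estimate refined by one Newton-Raphson step, and `coarse_rcp` and `refine` are hypothetical helpers, not ISPC's implementation.

```python
import math

def coarse_rcp(x, bits):
    """Toy reciprocal estimate: 1/x with its mantissa truncated to `bits` bits."""
    mantissa, exponent = math.frexp(1.0 / x)  # 1/x == mantissa * 2**exponent
    scale = 1 << bits
    return math.ldexp(math.floor(mantissa * scale) / scale, exponent)

def refine(x, y):
    """One Newton-Raphson step for the reciprocal: y' = y * (2 - x*y)."""
    return y * (2.0 - x * y)

x = 3.0
rcp_12bit = refine(x, coarse_rcp(x, 12))  # starts from a ~12-bit estimate
rcp_14bit = refine(x, coarse_rcp(x, 14))  # starts from a ~14-bit estimate

print(rcp_12bit, rcp_14bit, 1.0 / x)
```

Both refined values are close to 1/3, but they are not equal to each other or to the exactly rounded quotient, which is why replacing rcp() and rsqrt() with true division and sqrt() removes one source of divergence.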

If it's not the library, then it could be a slightly different sequence of FMA instructions due to optimization. You can try --opt=disable-fma to experiment.
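
For context on the FMA point: a fused multiply-add rounds once, after computing a*b + c exactly, while a separate multiply and add rounds twice. If the optimizer fuses different pairs of operations on the two targets, the last bits can differ. The sketch below emulates this in double precision with exact rational arithmetic; the helper names and input values are hypothetical and chosen only to make the difference visible.

```python
from fractions import Fraction

def fused_multiply_add(a, b, c):
    """Emulate FMA: compute a*b + c exactly, then round once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def separate_multiply_add(a, b, c):
    """Multiply and round, then add and round again (no fusion)."""
    product = a * b
    return product + c

# Illustrative values: a*a needs one more bit than a double can hold.
a = 1.0 + 2.0**-27
c = -(1.0 + 2.0**-26)

fused = fused_multiply_add(a, a, c)       # keeps the low bit of the product
unfused = separate_multiply_add(a, a, c)  # low bit is rounded away before the add

print(fused, unfused)
```

Here the unfused path returns exactly zero, while the fused path keeps the tiny residual. In real code the effect is usually a difference in the last bit or two, which can then be amplified by later operations.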

petrsm · 2024-03-04T20:05:14Z

Great tips! Thanks!

nurmukhametov · 2024-04-05T13:20:30Z

@petrsm, is your project open-source?

petrsm · 2024-04-05T13:37:18Z

Unfortunately, it is not. In the end, by avoiding rsqrt() and rcp(), I managed to make the results match between AVX512 and AVX2, except for a few specific functions. I ran out of time to investigate, so for the functions that do not match, I always use AVX2.

pbrubaker added the Bugs label Apr 25, 2024
