
benchmarks on Atom crashes on AVX2 #77

Open
htot opened this issue May 8, 2022 · 9 comments
htot (Contributor) commented May 8, 2022

When I build with all x86 codecs enabled and run the benchmarks on an Atom (x86_64), they crash due to an illegal instruction. I always thought I simply needed to disable the AVX and AVX2 codecs (and that does work around the problem). But looking at the code, even though the AVX* codecs are forced, support is apparently intended to be detected at runtime, so the crash is unintended.

I think it would be better to benchmark only the codecs whose instructions are actually supported, instead of crashing. But how do we do that? Not force the codec, or detect the supported codecs and force only from that list?

htot (Contributor, Author) commented May 26, 2022

@aklomp what are your ideas on this?

aklomp (Owner) commented Jun 2, 2022

Yes, support for the AVX* codecs is detected at runtime if they are compiled in.

IIRC, there are one or more x86 instructions that are not available on Atoms; I think movbe might be one. What's the exact illegal instruction the code is failing on? My guess is that it's not an AVX* opcode, but one of these unavailable ones.

What I think is happening is that by enabling -mavx2, you are implicitly telling the compiler "this code will be run on a machine that supports AVX2". And that gives the compiler a license to use instructions that are available on all AVX2-supporting platforms, such as movbe.

The fix for this is indeed to not compile the whole binary at the AVX support level, or maybe to use something like -march= or -mtune=.

htot (Contributor, Author) commented Jun 2, 2022

> Yes, support for the AVX* codecs is detected at runtime if they are compiled in.

No, that is for the library. In the benchmark, the codecs are "forced" (if they are compiled in). Each forced codec is then tested using a test string, and that test crashes.

I made a patch that uses the detection code and solves this issue by not running the test string when the CPU lacks support (on x86), gated behind an additional flag. But I don't know if you'd like that, since the comments mention that forcing is explicitly for testing purposes.

aklomp (Owner) commented Jun 2, 2022

But that's the same thing, right? You're running code compiled for an AVX-level machine on a machine that has no AVX support. Surely the fix is to not compile the tests with AVX codecs enabled?

htot (Contributor, Author) commented Jun 2, 2022

Correct.
The problem happens only when you run the same benchmark binary on different machines.

aklomp (Owner) commented Jun 2, 2022

Ah, I think I understand the problem now. You expect the benchmark binary to be compiled for the lowest common denominator (a machine that supports only the plain codec), and only the arch-specific codecs to use any more advanced instructions.

Well, you'd be correct, that's how the test Makefile is set up. The test binaries should run on the lowest common denominator machine.

Are you compiling using the Makefile, or using the new CMake stuff?

htot (Contributor, Author) commented Jun 2, 2022

I would expect the benchmark (for x86) to run all codecs supported by the CPU, not crash. However, as the compiled-in codecs are forced, an exception occurs on the unsupported instructions.

I use both the Makefile and CMake; there is no difference. Patching the benchmark so that the CPU detection code is run, to check whether a codec is really supported, fixes the problem. At the moment that detection code is not being used there.

aklomp (Owner) commented Jun 2, 2022

Now I remember; it's been a while. Yeah, the base binary is built without any SIMD flags, but you're right: all codecs that are built with the current flags will always be run (forced) by the binary. Forcing is necessary because otherwise the library would always choose the "best" codec; it is a way to downgrade to the codec we want to test. That is what the comments mean when they say forcing is needed for testing purposes.

My assumption was that the test binary didn't need to be portable because it's only run by developers. That's maybe not a very good assumption; I can see the use case for wanting to benchmark across systems with a single packaged tool.

There are various ways to fix this. The cleanest one (and one that I've been thinking about in other contexts) is to expose the various codecs through some introspection interface so that the client can choose which codec to use. Then we could get rid of the whole runtime CPU feature detection thing, or make it optional. I'd like to make this change some time, but it won't be soon (because it's a lot of rework and testing).

Another way of fixing it would be to run the CPU feature detection code in the test binary and only run codecs which are compatible with the current CPU. That's probably more straightforward, if we can factor out the feature detection code so that it can be easily reused.

htot (Contributor, Author) commented Jun 4, 2022

Maybe you could have a look at PR #82 (draft) - this implements the latter.
