Requesting clarification regarding the use of GNUC in xxhash code #709

rohanaggarwal7997 · 2022-05-06T00:35:04Z

I had a small question regarding the xxhash code. There is usage of __GNUC__ in some places in the code(in footnote) to get the performance of the compiled code by GCC similar to clang . My question is that even ICC defines __GNUC__. Was the code benchmarked for the performance considerations with ICC ? or should we be using an extra !defined(__INTEL_COMPILER) .

I am talking about the following two comments in your code

XXH_FORCE_INLINE xxh_u64 XXH3_mix16B(const xxh_u8* XXH_RESTRICT input,
                                     const xxh_u8* XXH_RESTRICT secret, xxh_u64 seed64)
{
#if defined(__GNUC__) && !defined(__clang__) /* GCC, not Clang */ \
  && defined(__i386__) && defined(__SSE2__)  /* x86 + SSE2 */ \
  && !defined(XXH_ENABLE_AUTOVECTORIZE)      /* Define to disable like XXH32 hack */
    /*
     * UGLY HACK:
     * GCC for x86 tends to autovectorize the 128-bit multiply, resulting in
     * slower code.
     *
     * By forcing seed64 into a register, we disrupt the cost model and
     * cause it to scalarize. See `XXH32_round()`
     *
     * FIXME: Clang's output is still _much_ faster -- On an AMD Ryzen 3600,
     * XXH3_64bits @ len=240 runs at 4.6 GB/s with Clang 9, but 3.3 GB/s on
     * GCC 9.2, despite both emitting scalar code.
     *
     * GCC generates much better scalar code than Clang for the rest of XXH3,
     * which is why finding a more optimal codepath is an interest.
     */
    XXH_COMPILER_GUARD(seed64);

/*
 * UGLY HACK:
 * GCC usually generates the best code with -O3 for xxHash.
 *
 * However, when targeting AVX2, it is overzealous in its unrolling resulting
 * in code roughly 3/4 the speed of Clang.
 *
 * There are other issues, such as GCC splitting _mm256_loadu_si256 into
 * _mm_loadu_si128 + _mm256_inserti128_si256. This is an optimization which
 * only applies to Sandy and Ivy Bridge... which don't even support AVX2.
 *
 * That is why when compiling the AVX2 version, it is recommended to use either
 *   -O2 -mavx2 -march=haswell
 * or
 *   -O2 -mavx2 -mno-avx256-split-unaligned-load
 * for decent performance, or to use Clang instead.
 *
 * Fortunately, we can control the first one with a pragma that forces GCC into
 * -O2, but the other one we can't control without "failed to inline always
 * inline function due to target mismatch" warnings.
 */

The text was updated successfully, but these errors were encountered:

Cyan4973 · 2022-05-06T00:49:44Z

I would be surprised if this code performance hack was checked on ICC,
though with godbolt, that might actually be possible.
cc @easyaspi314 , the author of this code.

easyaspi314 · 2022-05-06T01:01:43Z

Yeah those should be for GCC only.

It is really annoying that in order to check for actual GCC you have to check for the imposters.

How about just adding something like this

/* Attempts to detect *actual* GCC. */
#define XXH_IS_GCC (defined(__GNUC__) && !defined(__clang__) && !defined(__INTEL_COMPILER))
/* Detects GCC and Clang (in clang-cl mode, clang does not define __GNUC__) */
#define XXH_IS_GNU (defined(__GNUC__) || defined(__clang__))

Just like how XXH_GCC_VERSION is used.

Cyan4973 · 2022-05-06T02:05:27Z

I like it

GitMensch · 2022-06-03T12:28:53Z

So PR missing?

Cyan4973 · 2023-07-15T19:08:33Z

Coming back to this topic,
it seems there are still a few places in the code base where the following preprocessor test is present :
#if defined(__GNUC__) && !defined(__clang__)

I presume it's the kind of test where excluding icc would (presumably) be useful too.

Cyan4973 added the question label May 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Requesting clarification regarding the use of GNUC in xxhash code #709

Requesting clarification regarding the use of GNUC in xxhash code #709

rohanaggarwal7997 commented May 6, 2022 •

edited by Cyan4973

Cyan4973 commented May 6, 2022

easyaspi314 commented May 6, 2022

Cyan4973 commented May 6, 2022

GitMensch commented Jun 3, 2022

Cyan4973 commented Jul 15, 2023

Requesting clarification regarding the use of __GNUC__ in xxhash code #709

Requesting clarification regarding the use of __GNUC__ in xxhash code #709

Comments

rohanaggarwal7997 commented May 6, 2022 • edited by Cyan4973

Cyan4973 commented May 6, 2022

easyaspi314 commented May 6, 2022

Cyan4973 commented May 6, 2022

GitMensch commented Jun 3, 2022

Cyan4973 commented Jul 15, 2023

Requesting clarification regarding the use of GNUC in xxhash code #709

Requesting clarification regarding the use of GNUC in xxhash code #709

rohanaggarwal7997 commented May 6, 2022 •

edited by Cyan4973