Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #4265: Add appropriate compiler defines and flags for SIMD with MSVC #4266

Merged
merged 3 commits into from
May 21, 2024

Conversation

jessey-git
Copy link
Contributor

Description

This should address the following 2 issues when using MSVC:

  • MSVC does not set defines like __SSE2__ or __SSE4_2__ etc. we must do that ourselves
  • MSVC uses the /arch flag differently than other compilers and it should be set only once

The ramifications of not having those defines set basically means that OIIO won't notice any SIMD is possible.

For MSVC the /arch flag only controls which instruction set the compiler may use when generating code "on its own". For example, the x64 compiler by default will use SSE2 whenever possible when its generating code but when using /arch:AVX it will produce AVX code as it sees fit. NOTE: case matters here too! "avx" does not work but "AVX" does.

However, if the source code uses intrinsics, MSVC will happily use those instead. In fact it's possible to use avx512 intrinsics without specifying any /arch flag at all and it will happily generate corresponding avx512 code. This is much different than say GCC where, in order to use intrinsics at all, you must specify the corresponding flags allowing you to do so for the arch in question.

Fixes #4265

Tests

Existing simd_test.cpp should be sufficient I think. In addition to those, I've verified the below locally:

Currently: when using -DUSE_SIMD="sse4.2,avx2" to build, we get the wrong build:simd flags:
sse2

After: using the same flags as above:
sse2,sse3,ssse3,sse41,sse42,avx,avx2

Checklist:

  • I have read the contribution guidelines.
  • I have updated the documentation, if applicable.
  • I have ensured that the change is tested somewhere in the testsuite
    (adding new test cases if necessary).
  • If I added or modified a C++ API call, I have also amended the
    corresponding Python bindings (and if altering ImageBufAlgo functions, also
    exposed the new functionality as oiiotool options).
  • My code follows the prevailing code style of this project. If I haven't
    already run clang-format before submitting, I definitely will look at the CI
    test that runs clang-format and fix anything that it highlights as being
    nonconforming.

…and flags for SIMD with MSVC

Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
@jessey-git
Copy link
Contributor Author

Oh forgot to mention: The existing CMake there is quite suspect as it lumps the "Intel" compiler in with MSVC. This is most certainly incorrect as their compiler is based on LLVM. Should we remove?

@lgritz
Copy link
Collaborator

lgritz commented May 16, 2024

There are two intel compilers: the classic icc behaves a lot like MSVC, the newer llvm-based icx identifies as a clang-based compiler. They are called different things in our build scripts.

@jessey-git
Copy link
Contributor Author

Ah I see now, yes. There's both CMAKE_COMPILER_IS_INTEL and CMAKE_COMPILER_IS_INTELCLANG

string (REPLACE "," ";" SIMD_FEATURE_LIST ${USE_SIMD})
foreach (feature ${SIMD_FEATURE_LIST})
message (VERBOSE "SIMD feature: ${feature}")
if (MSVC OR CMAKE_COMPILER_IS_INTEL)
list (APPEND SIMD_COMPILE_FLAGS "/arch:${feature}")
if (feature STREQUAL "sse2")
add_compile_definitions(__SSE2__)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so sure about directly setting __SSE2__ and other compile definitions. My understanding is that you should use -mXXX on gcc/clang and /arch:XXX on MSVS, and that the compilers set the __BLAH__ symbols.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in general, I think we've moved away from directly using add_compile_definitions globally (dubious if we want to be safe to be included as a subproject), and instead (as you can see in the surrounding code) appending to SIMD_COMPILE_FLAGS, which will later be used for proj_add_compile_options() which is our OIIO-specific way to collect compile options that will be used for our individual libraries and binaries that we are building.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a bit more context in the Issue for this PR but MSVC will not set the __SSE*__ defines at all (and there are no arch flags for SSE in general). The only arch flags that exist are for AVX, which oddly enough will set the __AVX*__ defines on its own.

https://learn.microsoft.com/en-us/cpp/build/reference/arch-x64?view=msvc-170

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So gcc/clang will set the __BLAH__ symbols based on the -m settings, but with MSVS it's hit or miss? In MS land, how are you expected to test (#if, etc) the settings in the code?

OK, so I retract what I said about defining the symbols -- let's do that on MSVS (and icc) to ensure they're set. I'm very sure it's not needed in the gcc/clang compatible compilers.

Then the only suggestion I have is to use our proj_add_compile_options instead of setting them globally with add_compile_definitions so that we aren't polluting the definitions of any surrounding project.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but with MSVS it's hit or miss?
Not so much "hit or miss", more than MSVC simply lacks any machinery to target SSE3/SSE4 at all. It only ever has "please target SSE2" (which is implied by mere x64 for all practical purposes these days), or "please target AVX1 or AVX2", for which it does set proper preprocessor defines. For e.g. "I'm compiling for SSE4" case there simply is nothing - the compiler itself will never generate SSE4 instructions, but if you use intrinsics yourself, then they will work, and any runtime checks to guard them (or not do that) is your responsibility.

So realistically people end up doing things like:

  • Do I have SSE2? Check if either __SSE2__ is defined (clang/gcc) or we're simply compiling for x64 (i.e. either __x86_64__ or _M_X64 is defined)
  • Do I have various AVX variants? Check various AVX defines (__AVX__, __AVX2__, __AVX512F__)
  • Do I have SSE3 or SSE4? For gcc/clang check the defines, for MSVC it is "uhh, indicate presence of them in some other way". Which is exactly what this PR is trying to achieve here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understand that not all target levels are directly supported in MSVS - only SSE2, AVX, AVX2, AVX512.

What I meant by "hit or miss" is that with MSVS, selecting AVX varieties from the command line defines the __AVXx__ symbols, but selecting (or defaulting to, on x86_64) SSE does not set any of the __SSEx__ symbols. Gcc/clang do it uniformly, so I was not aware MSVS did not set the symbols for the SSE levels.

Copy link
Collaborator

@lgritz lgritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notwithstanding the notes about exactly which symbols to define or helper functions to call, this is definitely on the right track to correctly turn these into the few specific fags needed for MSVS which differ from gcc/clang -m that can take all sorts of fine-grained pieces.

string (REPLACE "," ";" SIMD_FEATURE_LIST ${USE_SIMD})
foreach (feature ${SIMD_FEATURE_LIST})
message (VERBOSE "SIMD feature: ${feature}")
if (MSVC OR CMAKE_COMPILER_IS_INTEL)
list (APPEND SIMD_COMPILE_FLAGS "/arch:${feature}")
if (feature STREQUAL "sse2")
add_compile_definitions(__SSE2__)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in general, I think we've moved away from directly using add_compile_definitions globally (dubious if we want to be safe to be included as a subproject), and instead (as you can see in the surrounding code) appending to SIMD_COMPILE_FLAGS, which will later be used for proj_add_compile_options() which is our OIIO-specific way to collect compile options that will be used for our individual libraries and binaries that we are building.

Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
Copy link
Collaborator

@lgritz lgritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@lgritz lgritz merged commit 3b31ddd into AcademySoftwareFoundation:master May 21, 2024
23 of 24 checks passed
lgritz pushed a commit to lgritz/OpenImageIO that referenced this pull request May 22, 2024
…cademySoftwareFoundation#4266)

This should address the following 2 issues when using MSVC:
- MSVC does not set defines like `__SSE2__` or `__SSE4_2__` etc. we must
do that ourselves
- MSVC uses the `/arch` flag differently than other compilers and it
should be set only once

The ramifications of not having those defines set basically means that
OIIO won't notice any SIMD is possible.

For MSVC the /arch flag only controls which instruction set the compiler
may use when generating code "on its own". For example, the x64 compiler
by default will use SSE2 whenever possible when its generating code but
when using `/arch:AVX` it will produce AVX code as it sees fit. NOTE:
case matters here too! "avx" does not work but "AVX" does.

However, if the source code uses intrinsics, MSVC will happily use those
instead. In fact it's possible to use `avx512` intrinsics without
specifying any /arch flag at all and it will happily generate
corresponding avx512 code. This is much different than say GCC where, in
order to use intrinsics at all, you must specify the corresponding flags
allowing you to do so for the arch in question.

Fixes AcademySoftwareFoundation#4265

## Tests

Existing simd_test.cpp should be sufficient I think. In addition to
those, I've verified the below locally:

Currently: when using `-DUSE_SIMD="sse4.2,avx2"` to build, we get the
wrong `build:simd` flags:
`sse2`

After: using the same flags as above:
`sse2,sse3,ssse3,sse41,sse42,avx,avx2`

---------

Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
lgritz pushed a commit to lgritz/OpenImageIO that referenced this pull request May 26, 2024
…cademySoftwareFoundation#4266)

This should address the following 2 issues when using MSVC:
- MSVC does not set defines like `__SSE2__` or `__SSE4_2__` etc. we must
do that ourselves
- MSVC uses the `/arch` flag differently than other compilers and it
should be set only once

The ramifications of not having those defines set basically means that
OIIO won't notice any SIMD is possible.

For MSVC the /arch flag only controls which instruction set the compiler
may use when generating code "on its own". For example, the x64 compiler
by default will use SSE2 whenever possible when its generating code but
when using `/arch:AVX` it will produce AVX code as it sees fit. NOTE:
case matters here too! "avx" does not work but "AVX" does.

However, if the source code uses intrinsics, MSVC will happily use those
instead. In fact it's possible to use `avx512` intrinsics without
specifying any /arch flag at all and it will happily generate
corresponding avx512 code. This is much different than say GCC where, in
order to use intrinsics at all, you must specify the corresponding flags
allowing you to do so for the arch in question.

Fixes AcademySoftwareFoundation#4265

Existing simd_test.cpp should be sufficient I think. In addition to
those, I've verified the below locally:

Currently: when using `-DUSE_SIMD="sse4.2,avx2"` to build, we get the
wrong `build:simd` flags:
`sse2`

After: using the same flags as above:
`sse2,sse3,ssse3,sse41,sse42,avx,avx2`

---------

Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] SIMD support does not work correctly when building with MSVC
3 participants