-
Notifications
You must be signed in to change notification settings - Fork 565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #4265: Add appropriate compiler defines and flags for SIMD with MSVC #4266
Fix #4265: Add appropriate compiler defines and flags for SIMD with MSVC #4266
Conversation
…and flags for SIMD with MSVC Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
Oh forgot to mention: The existing CMake there is quite suspect as it lumps the "Intel" compiler in with MSVC. This is most certainly incorrect as their compiler is based on LLVM. Should we remove? |
There are two intel compilers: the classic icc behaves a lot like MSVC, the newer llvm-based icx identifies as a clang-based compiler. They are called different things in our build scripts. |
Ah I see now, yes. There's both |
src/cmake/compiler.cmake
Outdated
string (REPLACE "," ";" SIMD_FEATURE_LIST ${USE_SIMD}) | ||
foreach (feature ${SIMD_FEATURE_LIST}) | ||
message (VERBOSE "SIMD feature: ${feature}") | ||
if (MSVC OR CMAKE_COMPILER_IS_INTEL) | ||
list (APPEND SIMD_COMPILE_FLAGS "/arch:${feature}") | ||
if (feature STREQUAL "sse2") | ||
add_compile_definitions(__SSE2__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not so sure about directly setting __SSE2__
and other compile definitions. My understanding is that you should use -mXXX
on gcc/clang and /arch:XXX
on MSVS, and that the compilers set the __BLAH__
symbols.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, in general, I think we've moved away from directly using add_compile_definitions
globally (dubious if we want to be safe to be included as a subproject), and instead (as you can see in the surrounding code) appending to SIMD_COMPILE_FLAGS, which will later be used for proj_add_compile_options()
which is our OIIO-specific way to collect compile options that will be used for our individual libraries and binaries that we are building.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a bit more context in the Issue for this PR but MSVC will not set the __SSE*__
defines at all (and there are no arch
flags for SSE in general). The only arch flags that exist are for AVX, which oddly enough will set the __AVX*__
defines on its own.
https://learn.microsoft.com/en-us/cpp/build/reference/arch-x64?view=msvc-170
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. So gcc/clang will set the __BLAH__
symbols based on the -m
settings, but with MSVS it's hit or miss? In MS land, how are you expected to test (#if
, etc) the settings in the code?
OK, so I retract what I said about defining the symbols -- let's do that on MSVS (and icc) to ensure they're set. I'm very sure it's not needed in the gcc/clang compatible compilers.
Then the only suggestion I have is to use our proj_add_compile_options
instead of setting them globally with add_compile_definitions
so that we aren't polluting the definitions of any surrounding project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but with MSVS it's hit or miss?
Not so much "hit or miss", more than MSVC simply lacks any machinery to target SSE3/SSE4 at all. It only ever has "please target SSE2" (which is implied by mere x64 for all practical purposes these days), or "please target AVX1 or AVX2", for which it does set proper preprocessor defines. For e.g. "I'm compiling for SSE4" case there simply is nothing - the compiler itself will never generate SSE4 instructions, but if you use intrinsics yourself, then they will work, and any runtime checks to guard them (or not do that) is your responsibility.
So realistically people end up doing things like:
- Do I have SSE2? Check if either
__SSE2__
is defined (clang/gcc) or we're simply compiling for x64 (i.e. either__x86_64__
or_M_X64
is defined) - Do I have various AVX variants? Check various AVX defines (
__AVX__
,__AVX2__
,__AVX512F__
) - Do I have SSE3 or SSE4? For gcc/clang check the defines, for MSVC it is "uhh, indicate presence of them in some other way". Which is exactly what this PR is trying to achieve here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I understand that not all target levels are directly supported in MSVS - only SSE2, AVX, AVX2, AVX512.
What I meant by "hit or miss" is that with MSVS, selecting AVX varieties from the command line defines the __AVXx__
symbols, but selecting (or defaulting to, on x86_64) SSE does not set any of the __SSEx__
symbols. Gcc/clang do it uniformly, so I was not aware MSVS did not set the symbols for the SSE levels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notwithstanding the notes about exactly which symbols to define or helper functions to call, this is definitely on the right track to correctly turn these into the few specific fags needed for MSVS which differ from gcc/clang -m
that can take all sorts of fine-grained pieces.
src/cmake/compiler.cmake
Outdated
string (REPLACE "," ";" SIMD_FEATURE_LIST ${USE_SIMD}) | ||
foreach (feature ${SIMD_FEATURE_LIST}) | ||
message (VERBOSE "SIMD feature: ${feature}") | ||
if (MSVC OR CMAKE_COMPILER_IS_INTEL) | ||
list (APPEND SIMD_COMPILE_FLAGS "/arch:${feature}") | ||
if (feature STREQUAL "sse2") | ||
add_compile_definitions(__SSE2__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, in general, I think we've moved away from directly using add_compile_definitions
globally (dubious if we want to be safe to be included as a subproject), and instead (as you can see in the surrounding code) appending to SIMD_COMPILE_FLAGS, which will later be used for proj_add_compile_options()
which is our OIIO-specific way to collect compile options that will be used for our individual libraries and binaries that we are building.
Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
3b31ddd
into
AcademySoftwareFoundation:master
…cademySoftwareFoundation#4266) This should address the following 2 issues when using MSVC: - MSVC does not set defines like `__SSE2__` or `__SSE4_2__` etc. we must do that ourselves - MSVC uses the `/arch` flag differently than other compilers and it should be set only once The ramifications of not having those defines set basically means that OIIO won't notice any SIMD is possible. For MSVC the /arch flag only controls which instruction set the compiler may use when generating code "on its own". For example, the x64 compiler by default will use SSE2 whenever possible when its generating code but when using `/arch:AVX` it will produce AVX code as it sees fit. NOTE: case matters here too! "avx" does not work but "AVX" does. However, if the source code uses intrinsics, MSVC will happily use those instead. In fact it's possible to use `avx512` intrinsics without specifying any /arch flag at all and it will happily generate corresponding avx512 code. This is much different than say GCC where, in order to use intrinsics at all, you must specify the corresponding flags allowing you to do so for the arch in question. Fixes AcademySoftwareFoundation#4265 ## Tests Existing simd_test.cpp should be sufficient I think. In addition to those, I've verified the below locally: Currently: when using `-DUSE_SIMD="sse4.2,avx2"` to build, we get the wrong `build:simd` flags: `sse2` After: using the same flags as above: `sse2,sse3,ssse3,sse41,sse42,avx,avx2` --------- Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
…cademySoftwareFoundation#4266) This should address the following 2 issues when using MSVC: - MSVC does not set defines like `__SSE2__` or `__SSE4_2__` etc. we must do that ourselves - MSVC uses the `/arch` flag differently than other compilers and it should be set only once The ramifications of not having those defines set basically means that OIIO won't notice any SIMD is possible. For MSVC the /arch flag only controls which instruction set the compiler may use when generating code "on its own". For example, the x64 compiler by default will use SSE2 whenever possible when its generating code but when using `/arch:AVX` it will produce AVX code as it sees fit. NOTE: case matters here too! "avx" does not work but "AVX" does. However, if the source code uses intrinsics, MSVC will happily use those instead. In fact it's possible to use `avx512` intrinsics without specifying any /arch flag at all and it will happily generate corresponding avx512 code. This is much different than say GCC where, in order to use intrinsics at all, you must specify the corresponding flags allowing you to do so for the arch in question. Fixes AcademySoftwareFoundation#4265 Existing simd_test.cpp should be sufficient I think. In addition to those, I've verified the below locally: Currently: when using `-DUSE_SIMD="sse4.2,avx2"` to build, we get the wrong `build:simd` flags: `sse2` After: using the same flags as above: `sse2,sse3,ssse3,sse41,sse42,avx,avx2` --------- Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
Description
This should address the following 2 issues when using MSVC:
__SSE2__
or__SSE4_2__
etc. we must do that ourselves/arch
flag differently than other compilers and it should be set only onceThe ramifications of not having those defines set basically means that OIIO won't notice any SIMD is possible.
For MSVC the /arch flag only controls which instruction set the compiler may use when generating code "on its own". For example, the x64 compiler by default will use SSE2 whenever possible when its generating code but when using
/arch:AVX
it will produce AVX code as it sees fit. NOTE: case matters here too! "avx" does not work but "AVX" does.However, if the source code uses intrinsics, MSVC will happily use those instead. In fact it's possible to use
avx512
intrinsics without specifying any /arch flag at all and it will happily generate corresponding avx512 code. This is much different than say GCC where, in order to use intrinsics at all, you must specify the corresponding flags allowing you to do so for the arch in question.Fixes #4265
Tests
Existing simd_test.cpp should be sufficient I think. In addition to those, I've verified the below locally:
Currently: when using
-DUSE_SIMD="sse4.2,avx2"
to build, we get the wrongbuild:simd
flags:sse2
After: using the same flags as above:
sse2,sse3,ssse3,sse41,sse42,avx,avx2
Checklist:
(adding new test cases if necessary).
corresponding Python bindings (and if altering ImageBufAlgo functions, also
exposed the new functionality as oiiotool options).
already run clang-format before submitting, I definitely will look at the CI
test that runs clang-format and fix anything that it highlights as being
nonconforming.