-
Notifications
You must be signed in to change notification settings - Fork 565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] SIMD support does not work correctly when building with MSVC #4265
Comments
jessey-git
added a commit
to jessey-git/oiio
that referenced
this issue
May 16, 2024
…and flags for SIMD with MSVC Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
5 tasks
lgritz
pushed a commit
that referenced
this issue
May 21, 2024
…4266) This should address the following 2 issues when using MSVC: - MSVC does not set defines like `__SSE2__` or `__SSE4_2__` etc. we must do that ourselves - MSVC uses the `/arch` flag differently than other compilers and it should be set only once The ramifications of not having those defines set basically means that OIIO won't notice any SIMD is possible. For MSVC the /arch flag only controls which instruction set the compiler may use when generating code "on its own". For example, the x64 compiler by default will use SSE2 whenever possible when its generating code but when using `/arch:AVX` it will produce AVX code as it sees fit. NOTE: case matters here too! "avx" does not work but "AVX" does. However, if the source code uses intrinsics, MSVC will happily use those instead. In fact it's possible to use `avx512` intrinsics without specifying any /arch flag at all and it will happily generate corresponding avx512 code. This is much different than say GCC where, in order to use intrinsics at all, you must specify the corresponding flags allowing you to do so for the arch in question. Fixes #4265 ## Tests Existing simd_test.cpp should be sufficient I think. In addition to those, I've verified the below locally: Currently: when using `-DUSE_SIMD="sse4.2,avx2"` to build, we get the wrong `build:simd` flags: `sse2` After: using the same flags as above: `sse2,sse3,ssse3,sse41,sse42,avx,avx2` --------- Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
lgritz
pushed a commit
to lgritz/OpenImageIO
that referenced
this issue
May 22, 2024
…cademySoftwareFoundation#4266) This should address the following 2 issues when using MSVC: - MSVC does not set defines like `__SSE2__` or `__SSE4_2__` etc. we must do that ourselves - MSVC uses the `/arch` flag differently than other compilers and it should be set only once The ramifications of not having those defines set basically means that OIIO won't notice any SIMD is possible. For MSVC the /arch flag only controls which instruction set the compiler may use when generating code "on its own". For example, the x64 compiler by default will use SSE2 whenever possible when its generating code but when using `/arch:AVX` it will produce AVX code as it sees fit. NOTE: case matters here too! "avx" does not work but "AVX" does. However, if the source code uses intrinsics, MSVC will happily use those instead. In fact it's possible to use `avx512` intrinsics without specifying any /arch flag at all and it will happily generate corresponding avx512 code. This is much different than say GCC where, in order to use intrinsics at all, you must specify the corresponding flags allowing you to do so for the arch in question. Fixes AcademySoftwareFoundation#4265 ## Tests Existing simd_test.cpp should be sufficient I think. In addition to those, I've verified the below locally: Currently: when using `-DUSE_SIMD="sse4.2,avx2"` to build, we get the wrong `build:simd` flags: `sse2` After: using the same flags as above: `sse2,sse3,ssse3,sse41,sse42,avx,avx2` --------- Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
lgritz
pushed a commit
to lgritz/OpenImageIO
that referenced
this issue
May 26, 2024
…cademySoftwareFoundation#4266) This should address the following 2 issues when using MSVC: - MSVC does not set defines like `__SSE2__` or `__SSE4_2__` etc. we must do that ourselves - MSVC uses the `/arch` flag differently than other compilers and it should be set only once The ramifications of not having those defines set basically means that OIIO won't notice any SIMD is possible. For MSVC the /arch flag only controls which instruction set the compiler may use when generating code "on its own". For example, the x64 compiler by default will use SSE2 whenever possible when its generating code but when using `/arch:AVX` it will produce AVX code as it sees fit. NOTE: case matters here too! "avx" does not work but "AVX" does. However, if the source code uses intrinsics, MSVC will happily use those instead. In fact it's possible to use `avx512` intrinsics without specifying any /arch flag at all and it will happily generate corresponding avx512 code. This is much different than say GCC where, in order to use intrinsics at all, you must specify the corresponding flags allowing you to do so for the arch in question. Fixes AcademySoftwareFoundation#4265 Existing simd_test.cpp should be sufficient I think. In addition to those, I've verified the below locally: Currently: when using `-DUSE_SIMD="sse4.2,avx2"` to build, we get the wrong `build:simd` flags: `sse2` After: using the same flags as above: `sse2,sse3,ssse3,sse41,sse42,avx,avx2` --------- Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the bug
Broadly speaking there are two issues here:
/arch
switch only supports the following items[1]:AVX|AVX2|AVX512
__SSE4_2__
, or similar, meaning all of the machinery insidesimd.h
is unused[2][1] https://learn.microsoft.com/en-us/cpp/build/reference/arch-x64?view=msvc-170
Trying to use
USE_SIMD="sse4.2"
will result in the following warnings during build:[2] When using
USE_SIMD="sse4.2"
The cmake output says that things are happening:
But trying the resulting binaries results in the following attribute values:
OpenImageIO version and dependencies
2.5.x and master
I'll see if I can make a fix to untangle but it's a bit complicated.
The text was updated successfully, but these errors were encountered: