Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] SIMD support does not work correctly when building with MSVC #4265

Closed
jessey-git opened this issue May 15, 2024 · 0 comments · Fixed by #4266
Closed

[BUG] SIMD support does not work correctly when building with MSVC #4265

jessey-git opened this issue May 15, 2024 · 0 comments · Fixed by #4266

Comments

@jessey-git
Copy link
Contributor

Describe the bug

Broadly speaking there are two issues here:

  • MSVC's /arch switch only supports the following items[1]: AVX|AVX2|AVX512
  • MSVC does not define __SSE4_2__, or similar, meaning all of the machinery inside simd.h is unused[2]

[1] https://learn.microsoft.com/en-us/cpp/build/reference/arch-x64?view=msvc-170
Trying to use USE_SIMD="sse4.2" will result in the following warnings during build:

cl : command line  warning D9002: ignoring unknown option '/arch:sse4.2'

[2] When using USE_SIMD="sse4.2"
The cmake output says that things are happening:

-- Building with C++17, downstream minimum C++17
-- Compiling with SIMD level sse4.2
-- SIMD feature: sse4.2

But trying the resulting binaries results in the following attribute values:

hw:simd = sse2,sse3,ssse3,sse41,sse42,avx,avx2,fma,f16c,popcnt,rdrand
build:simd = sse2

OpenImageIO version and dependencies
2.5.x and master

I'll see if I can make a fix to untangle but it's a bit complicated.

jessey-git added a commit to jessey-git/oiio that referenced this issue May 16, 2024
…and flags for SIMD with MSVC

Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
lgritz pushed a commit that referenced this issue May 21, 2024
…4266)

This should address the following 2 issues when using MSVC:
- MSVC does not set defines like `__SSE2__` or `__SSE4_2__` etc. we must
do that ourselves
- MSVC uses the `/arch` flag differently than other compilers and it
should be set only once

The ramifications of not having those defines set basically means that
OIIO won't notice any SIMD is possible.

For MSVC the /arch flag only controls which instruction set the compiler
may use when generating code "on its own". For example, the x64 compiler
by default will use SSE2 whenever possible when its generating code but
when using `/arch:AVX` it will produce AVX code as it sees fit. NOTE:
case matters here too! "avx" does not work but "AVX" does.

However, if the source code uses intrinsics, MSVC will happily use those
instead. In fact it's possible to use `avx512` intrinsics without
specifying any /arch flag at all and it will happily generate
corresponding avx512 code. This is much different than say GCC where, in
order to use intrinsics at all, you must specify the corresponding flags
allowing you to do so for the arch in question.

Fixes #4265

## Tests

Existing simd_test.cpp should be sufficient I think. In addition to
those, I've verified the below locally:

Currently: when using `-DUSE_SIMD="sse4.2,avx2"` to build, we get the
wrong `build:simd` flags:
`sse2`

After: using the same flags as above:
`sse2,sse3,ssse3,sse41,sse42,avx,avx2`

---------

Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
lgritz pushed a commit to lgritz/OpenImageIO that referenced this issue May 22, 2024
…cademySoftwareFoundation#4266)

This should address the following 2 issues when using MSVC:
- MSVC does not set defines like `__SSE2__` or `__SSE4_2__` etc. we must
do that ourselves
- MSVC uses the `/arch` flag differently than other compilers and it
should be set only once

The ramifications of not having those defines set basically means that
OIIO won't notice any SIMD is possible.

For MSVC the /arch flag only controls which instruction set the compiler
may use when generating code "on its own". For example, the x64 compiler
by default will use SSE2 whenever possible when its generating code but
when using `/arch:AVX` it will produce AVX code as it sees fit. NOTE:
case matters here too! "avx" does not work but "AVX" does.

However, if the source code uses intrinsics, MSVC will happily use those
instead. In fact it's possible to use `avx512` intrinsics without
specifying any /arch flag at all and it will happily generate
corresponding avx512 code. This is much different than say GCC where, in
order to use intrinsics at all, you must specify the corresponding flags
allowing you to do so for the arch in question.

Fixes AcademySoftwareFoundation#4265

## Tests

Existing simd_test.cpp should be sufficient I think. In addition to
those, I've verified the below locally:

Currently: when using `-DUSE_SIMD="sse4.2,avx2"` to build, we get the
wrong `build:simd` flags:
`sse2`

After: using the same flags as above:
`sse2,sse3,ssse3,sse41,sse42,avx,avx2`

---------

Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
lgritz pushed a commit to lgritz/OpenImageIO that referenced this issue May 26, 2024
…cademySoftwareFoundation#4266)

This should address the following 2 issues when using MSVC:
- MSVC does not set defines like `__SSE2__` or `__SSE4_2__` etc. we must
do that ourselves
- MSVC uses the `/arch` flag differently than other compilers and it
should be set only once

The ramifications of not having those defines set basically means that
OIIO won't notice any SIMD is possible.

For MSVC the /arch flag only controls which instruction set the compiler
may use when generating code "on its own". For example, the x64 compiler
by default will use SSE2 whenever possible when its generating code but
when using `/arch:AVX` it will produce AVX code as it sees fit. NOTE:
case matters here too! "avx" does not work but "AVX" does.

However, if the source code uses intrinsics, MSVC will happily use those
instead. In fact it's possible to use `avx512` intrinsics without
specifying any /arch flag at all and it will happily generate
corresponding avx512 code. This is much different than say GCC where, in
order to use intrinsics at all, you must specify the corresponding flags
allowing you to do so for the arch in question.

Fixes AcademySoftwareFoundation#4265

Existing simd_test.cpp should be sufficient I think. In addition to
those, I've verified the below locally:

Currently: when using `-DUSE_SIMD="sse4.2,avx2"` to build, we get the
wrong `build:simd` flags:
`sse2`

After: using the same flags as above:
`sse2,sse3,ssse3,sse41,sse42,avx,avx2`

---------

Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant