Possibility to compile multi-arch for macOS Universal Binary #921

Open
zakalawe opened this issue Sep 27, 2023 · 7 comments

Comments

@zakalawe

zakalawe commented Sep 27, 2023

I'm experimenting with making a project build as a Universal Binary, and of course that means our dependencies would ideally be built that way as well. (I know there are solutions to avoid this, but hear me out)

CMake supports the CMAKE_OSX_ARCHITECTURES variable, which is mapped to compiler -arch flags. On Apple platforms, providing multiple values is allowed, e.g. '-arch arm64 -arch x86_64' to build a Universal macOS object. This already works somewhat with OpenAL Soft, except that it falls down on the SSE and NEON intrinsics, because at CMake configure time there is no single correct answer for whether these should be compiled.

My proposed solution would be to adjust the config header (config.h.in) to wrap the #cmakedefines for HAVE_SSE and HAVE_NEON in checks on the compiler's predefined preprocessor macros for those features. This is a tiny bit ugly but will be contained to the config file only; everything else (e.g. in al_numerics.h) will switch correctly off the result, as each file is compiled for a particular architecture.

(This will also fix cross-compiling from x86_64 to arm64 ... currently this gets in a mess because xmmintrin.h is detected by CMake but compilation fails)

It's made uglier by the fact that CL.exe and GCC/Clang disagree, of course, on what to call the preprocessor defines they set based on CPU architecture, but c'est la vie.

Attached an example of what I have in mind, but would like to know if this idea is generally acceptable, before proceeding much further.
config.h.txt
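A minimal sketch of the idea proposed above, and not the contents of the attached config.h.txt: the result of the CMake detection is only honoured when the architecture currently being compiled matches. The SKETCH_* names are hypothetical stand-ins (in a real config.h.in the first two would come from #cmakedefine lines), and the guards test target-architecture macros rather than feature macros such as __SSE__.

```cpp
// Hypothetical stand-ins for what CMake would substitute at configure time.
// In a universal build both are "found", because configure runs only once.
#define SKETCH_CMAKE_FOUND_SSE 1
#define SKETCH_CMAKE_FOUND_NEON 1

// x86 target: MSVC spells it _M_IX86 / _M_X64, GCC and Clang __i386__ / __x86_64__.
#if defined(_M_IX86) || defined(_M_X64) || defined(__i386__) || defined(__x86_64__)
#define SKETCH_TARGET_IS_X86 1
#else
#define SKETCH_TARGET_IS_X86 0
#endif

// ARM target: MSVC spells it _M_ARM / _M_ARM64, GCC and Clang __arm__ / __aarch64__.
#if defined(_M_ARM) || defined(_M_ARM64) || defined(__arm__) || defined(__aarch64__)
#define SKETCH_TARGET_IS_ARM 1
#else
#define SKETCH_TARGET_IS_ARM 0
#endif

// Only expose each feature for the architecture of the current compile pass.
#if SKETCH_CMAKE_FOUND_SSE && SKETCH_TARGET_IS_X86
#define HAVE_SSE
#endif

#if SKETCH_CMAKE_FOUND_NEON && SKETCH_TARGET_IS_ARM
#define HAVE_NEON
#endif

// Small helpers so the outcome can be inspected from ordinary code.
constexpr bool have_sse()
{
#ifdef HAVE_SSE
    return true;
#else
    return false;
#endif
}

constexpr bool have_neon()
{
#ifdef HAVE_NEON
    return true;
#else
    return false;
#endif
}
```

With '-arch arm64 -arch x86_64' the compiler runs once per architecture, so the arm64 pass would see only HAVE_NEON and the x86_64 pass only HAVE_SSE.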

@kcat
Owner

kcat commented Sep 27, 2023

The general idea is probably okay, but those particular checks aren't. __SSE__ isn't defined without the -msse switch (or a target that has SSE by default), but HAVE_SSE is used to determine if SSE-enabled mixer functions are available, which they can be without SSE being enabled globally (on some targets, -msse may only be set when compiling mixer_sse.cpp, while other sources check HAVE_SSE to know if the functions in that source exist). And _M_IX86_FP is an MSVC-only macro, which would prevent HAVE_SSE_INTRINSICS from being set on other compilers.

@zakalawe
Author

The _M_IX86_FP I think is okay, because I'm using the or-operator to combine the different compiler checks: it's always going to look like #if defined(<MSVC_flag>) || defined(<GCC_and_Clang_flag>) in the end, for both the SSE/AVX side and the ARM/NEON side.

I need to do some much deeper reading to figure out how to detect SSE though, if you're only passing -msse selectively - that is going to complicate things quite a bit, hmmm.

@kcat
Owner

kcat commented Sep 27, 2023

> The _M_IX86_FP I think is okay, because I'm using the or-operator to combine the different compiler checks: it's always going to look like #if defined(<MSVC_flag>) || defined(<GCC_and_Clang_flag>) in the end, for both the SSE/AVX side and the ARM/NEON side.

I meant for the HAVE_SSE_INTRINSICS macro, which only checks _M_IX86_FP. Though that doesn't look appropriate itself either since MSDN says: "This macro is always defined when the compilation target is an x86 processor. Otherwise, undefined." It doesn't seem it would be defined for x64 targets. I think the check would need to be

#if defined(_M_IX86) || (defined(_M_X64) && !defined(_M_ARM)) || defined(__i386__) || defined(__x86_64__)

To ensure it's an x86-32 or x86-64 target.
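To make the distinction drawn in this comment concrete, here is a small sketch that separates "the target is x86" from "SSE code generation is enabled for this translation unit". The function names are hypothetical, and treating every MSVC x64 build as SSE-enabled is an assumption of the sketch.

```cpp
// True when compiling for 32-bit or 64-bit x86, whether or not -msse
// (or /arch:SSE) was passed for this translation unit.
constexpr bool targets_x86()
{
#if defined(_M_IX86) || defined(_M_X64) || defined(__i386__) || defined(__x86_64__)
    return true;
#else
    return false;
#endif
}

// True only when the compiler may emit SSE instructions in this file.
// GCC and Clang define __SSE__ for that; 32-bit MSVC reports it through
// _M_IX86_FP (1 for /arch:SSE, 2 for /arch:SSE2 and above).
constexpr bool sse_enabled_here()
{
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
    return true;
#else
    return false;
#endif
}
```

On a 32-bit x86 build without -msse, targets_x86() would be true while sse_enabled_here() is false, which is the case where an SSE mixer can still exist in a separately compiled source file.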

> I need to do some much deeper reading to figure out how to detect SSE though, if you're only passing -msse selectively - that is going to complicate things quite a bit, hmmm.

I'd like to not be selective like that, but there are some 32-bit distros that don't want to require SSE, and having SSE versions of some functions selected at runtime on capable CPUs is beneficial (if not ideal).
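The runtime selection described here can be sketched roughly as follows. This is an illustration under stated assumptions, not OpenAL Soft's actual code: the function names are hypothetical, and it uses the GCC/Clang per-function target attribute together with __builtin_cpu_supports instead of a separate source file compiled with -msse.

```cpp
#include <cstddef>

// Only GCC-compatible compilers on x86 get the SSE path in this sketch.
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE_PATH 1
#endif

// Portable fallback, always compiled.
void apply_gain_c(float* dst, const float* src, float gain, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        dst[i] = src[i] * gain;
}

#ifdef SKETCH_HAVE_SSE_PATH
// SSE is enabled for this one function only, so the rest of the file can
// still be built without -msse.
__attribute__((target("sse")))
void apply_gain_sse(float* dst, const float* src, float gain, std::size_t count)
{
    const __m128 g = _mm_set1_ps(gain);
    std::size_t i = 0;
    for (; i + 4 <= count; i += 4)
        _mm_storeu_ps(dst + i, _mm_mul_ps(_mm_loadu_ps(src + i), g));
    for (; i < count; ++i) // remaining 0 to 3 samples
        dst[i] = src[i] * gain;
}
#endif

using GainFunc = void (*)(float*, const float*, float, std::size_t);

// Pick the fastest implementation the running CPU supports.
GainFunc select_apply_gain()
{
#ifdef SKETCH_HAVE_SSE_PATH
    if (__builtin_cpu_supports("sse"))
        return apply_gain_sse;
#endif
    return apply_gain_c;
}
```

A caller would fetch the pointer once, e.g. `GainFunc f = select_apply_gain();`, and reuse it for every buffer. The existence of apply_gain_sse depends only on the target architecture, not on whether SSE is enabled globally.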

@zakalawe
Author

My gut feeling is that this is going to get messy because of the handling of SSE: while I still like the idea in principle, I think doing two separate build trees and combining them with 'lipo' is going to be easier and much less disruptive for your side.

I will still need a patch to enable cross-compilation, however, since the detection of xmmintrin.h is based only on whether the file exists, but it needs to be guarded by the target architecture.

@MoNTE48

MoNTE48 commented Oct 2, 2023

@zakalawe
Author

zakalawe commented Oct 2, 2023

https://github.com/MultiCraft/MultiCraft/blob/main/Apple/scripts/openal.sh

Because you explicitly set CFLAGS / CXXFLAGS, I guess this means no NEON or SSE optimisations are used at all, which is fine. I'm just checking that I have understood how your script interacts with what OpenAL Soft does inside its own CMake files.

@MoNTE48

MoNTE48 commented Oct 2, 2023

@zakalawe, well, another option would be to build two versions of openal-soft (Intel and aarch64) and merge them using lipo.

(As suggested above. But I wonder how much it will improve performance in your particular case.)
