fpu_ctrl: exclude non i386 archs from SSE #926

Open
es20490446e opened this issue Oct 5, 2023 · 7 comments

Comments

@es20490446e

To:
core/fpu_ctrl.cpp

I have seen a place that applies this patch:
mxcsr.patch

Do you think this is something that could be included here?

If so I can make a pull request.

@es20490446e
Author

This is for compatibility with the Russian Elbrus2k CPU architecture.

@kcat
Owner

kcat commented Oct 5, 2023

I'm curious what the failure actually is: why is SSE detected and HAVE_SSE set, yet the stmxcsr or ldmxcsr instructions fail? Does GCC not support inline assembly on that architecture, or does it only support the intrinsics and map them to other instructions?

A better option may be to avoid the inline assembly altogether, and optionally enable use of the intrinsics when SSE is available. It'd require moving the SSE calls to a separate function that can enable SSE codegen separate from the enter/leave methods, with those functions only being called when the appropriate CPUCapFlags are set.
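A minimal sketch of that idea, not the project's actual code: the MXCSR accesses live in small helper functions built on the `_mm_getcsr`/`_mm_setcsr` intrinsics, and the enter/leave logic only calls them when a runtime capability flag is set. All `Sketch*`/`sketch_*` names are hypothetical, `SketchCpuCaps` merely stands in for CPUCapFlags, and the "detection" here is a compile-time shortcut rather than a real CPU query. On GCC or Clang, a per-function `target` attribute could enable SSE codegen for just the helpers; that part is left out to keep the example portable.

```cpp
// Compile-time check: are the SSE intrinsics usable in this translation unit?
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE_INTRINSICS 1
#else
#define SKETCH_HAVE_SSE_INTRINSICS 0
#endif

// Hypothetical runtime capability bits, standing in for the project's CPUCapFlags.
enum SketchCpuCaps : unsigned { SKETCH_CAP_SSE = 1u << 0 };

// Assumption for brevity: report SSE whenever the intrinsics were compiled in.
// A real build would query the CPU at runtime instead.
inline unsigned sketch_detect_caps() noexcept
{ return SKETCH_HAVE_SSE_INTRINSICS ? SKETCH_CAP_SSE : 0u; }

// MXCSR bit 15 is the flush-to-zero (FTZ) control bit.
constexpr unsigned kSketchFlushToZero = 0x8000u;

// The only two functions that touch MXCSR. No inline assembly is involved,
// so a compiler that maps the intrinsics to other instructions can build them.
inline unsigned sketch_read_csr() noexcept
{
#if SKETCH_HAVE_SSE_INTRINSICS
    return _mm_getcsr();
#else
    return 0u;
#endif
}

inline void sketch_write_csr(unsigned csr) noexcept
{
#if SKETCH_HAVE_SSE_INTRINSICS
    _mm_setcsr(csr);
#else
    (void)csr;
#endif
}

// RAII stand-in for the enter/leave pair: enables FTZ on construction and
// restores the saved MXCSR value on destruction, but only if SSE was reported.
class SketchFtzGuard {
    unsigned mSavedCsr{0u};
    bool mActive{false};

public:
    explicit SketchFtzGuard(unsigned caps) noexcept
    {
        if(!(caps & SKETCH_CAP_SSE))
            return;
        mSavedCsr = sketch_read_csr();
        sketch_write_csr(mSavedCsr | kSketchFlushToZero);
        mActive = true;
    }
    ~SketchFtzGuard() { if(mActive) sketch_write_csr(mSavedCsr); }

    SketchFtzGuard(const SketchFtzGuard&) = delete;
    SketchFtzGuard& operator=(const SketchFtzGuard&) = delete;

    bool active() const noexcept { return mActive; }
};
```

Usage would be along the lines of `SketchFtzGuard guard{sketch_detect_caps()};` at the top of a mixing routine. When the flag is not set the guard does nothing, so the same call site works on targets without SSE.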

@es20490446e
Author

@stgatilov Do you have any suggestion on this?

@kcat
Owner

kcat commented Oct 6, 2023

Does commit 28ebc90 help any? It tries to avoid the inline assembly and use the intrinsics when SSE is detected at runtime.

@es20490446e
Author

I personally cannot comment on this, because I wasn't the person creating the patch.

I will ask him to come.

@es20490446e
Author

Request

@stgatilov

The Elbrus GCC-based compiler supports SSE intrinsics by translating them into Elbrus VLIW instructions.
But it does not support x86 assembly, cpuid, or other x86-specific features.
So it defines (or should define) SSE support, even though the target is not x86.

To be honest, I wouldn't be surprised if some of the Elbrus guys have already submitted a fix for this to master.
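To illustrate the distinction described above, here is a hedged sketch that treats "the target is x86" and "SSE intrinsics are available" as two separate compile-time questions. The `SKETCH_*` macros, `MxcsrPath`, and `sketch_choose_mxcsr_path` are hypothetical names, not anything from the project. The predefined macros checked are the usual GCC/Clang/MSVC ones, and the assumption that a non-x86 compiler with SSE intrinsics defines `__SSE__` follows the comment above.

```cpp
// Question 1: is the target an x86 CPU (so x86 inline assembly and cpuid exist)?
#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#define SKETCH_IS_X86 1
#else
#define SKETCH_IS_X86 0
#endif

// Question 2: does the compiler offer the SSE intrinsics, whatever the target?
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#define SKETCH_HAS_SSE_INTRINSICS 1
#else
#define SKETCH_HAS_SSE_INTRINSICS 0
#endif

// Question 3: does the compiler accept GNU-style inline assembly syntax?
#if defined(__GNUC__)
#define SKETCH_HAS_GNU_ASM 1
#else
#define SKETCH_HAS_GNU_ASM 0
#endif

enum class MxcsrPath { Intrinsics, InlineAsm, None };

// Prefer the intrinsics whenever they exist, since they do not depend on the
// target being x86. Fall back to x86 inline assembly only on a real x86 target.
constexpr MxcsrPath sketch_choose_mxcsr_path(bool is_x86, bool has_sse_intrinsics,
    bool has_gnu_asm) noexcept
{
    if(has_sse_intrinsics)
        return MxcsrPath::Intrinsics;
    if(is_x86 && has_gnu_asm)
        return MxcsrPath::InlineAsm;
    return MxcsrPath::None;
}

// The path this particular build would take.
constexpr MxcsrPath kSketchMxcsrPath = sketch_choose_mxcsr_path(
    SKETCH_IS_X86 != 0, SKETCH_HAS_SSE_INTRINSICS != 0, SKETCH_HAS_GNU_ASM != 0);
```

Under these assumptions, a GCC-compatible compiler for a non-x86 target that still provides SSE intrinsics would take the `Intrinsics` path and never reach the `stmxcsr`/`ldmxcsr` inline assembly.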
