AVX optimization opportunities #135

Open
sherief opened this issue Dec 10, 2021 · 2 comments

@sherief

sherief commented Dec 10, 2021

I'm compiling Ozz Animation for AVX and I noticed in simd_math_sse-inl.h that only OZZ_SHUFFLE_PS1() is specialized for AVX. I thought there might be opportunities to use newer-than-SSE2 intrinsics for things like hadd, but I was wondering whether there's a reason they aren't already used, or whether it was just a lack of time to implement them. I don't mind implementing them and submitting a PR, but if you've already looked into it and decided that they don't help with perf or don't fit in, then I can save the time and effort.

@guillaumeblanc
Owner

Hi,

If I remember correctly, I looked at hadd but apparently didn't end up using it. I think there was no clear benefit from hadd for the use case I studied (https://github.com/guillaumeblanc/ozz-animation/blob/master/include/ozz/base/maths/internal/simd_math_sse-inl.h#L65), because hadd only adds pairs of adjacent components, so a full 4-component sum still takes two of them: https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=hadd&ig_expand=3846.

Looking at that thread https://stackoverflow.com/questions/6996764/fastest-way-to-do-horizontal-sse-vector-sum-or-other-reduction, there might be something better to do with shuffles.
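To make the comparison concrete, here is a minimal sketch of the two approaches discussed above: a 4-lane horizontal sum built from plain SSE shuffles, and the same sum built from two hadd instructions. The names `HSumShuffle` and `HSumHadd` are hypothetical and do not come from ozz-animation. The sketch assumes an x86 target, and the hadd variant is only compiled when the compiler defines `__SSE3__` (for example with `-msse3` or `-mavx` on GCC/Clang).

```cpp
#include <xmmintrin.h>  // SSE: __m128, _mm_add_ps, _mm_shuffle_ps, _mm_movehl_ps
#if defined(__SSE3__)
#include <pmmintrin.h>  // SSE3: _mm_hadd_ps
#endif

// Hypothetical helper: sums the 4 lanes of v using SSE1 shuffles only.
inline float HSumShuffle(__m128 v) {
  // shuf = (v1, v0, v3, v2): swap the two lanes inside each pair.
  const __m128 shuf = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 3, 0, 1));
  // sums = (v0+v1, v0+v1, v2+v3, v2+v3)
  const __m128 sums = _mm_add_ps(v, shuf);
  // hi lane 0 = sums lane 2 = v2+v3
  const __m128 hi = _mm_movehl_ps(shuf, sums);
  // Lane 0 of the result holds v0+v1+v2+v3.
  return _mm_cvtss_f32(_mm_add_ss(sums, hi));
}

#if defined(__SSE3__)
// Hypothetical helper: same sum with hadd. Each hadd only adds adjacent
// pairs, so two of them are needed to reduce 4 lanes.
inline float HSumHadd(__m128 v) {
  const __m128 pairs = _mm_hadd_ps(v, v);          // (v0+v1, v2+v3, v0+v1, v2+v3)
  const __m128 total = _mm_hadd_ps(pairs, pairs);  // every lane = v0+v1+v2+v3
  return _mm_cvtss_f32(total);
}
#endif
```

Both functions return the same value for the same input, for example `HSumShuffle(_mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f))` yields `10.0f`. Which one is faster depends on the CPU, so any change along these lines would need to be benchmarked on the targets that matter.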

If you find something better to do with hadd, or any other SSE2+ intrinsic, a PR will be very welcome for sure!

Note, though, that the dot-product intrinsic is used when available.
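As an illustration of what "used when available" can look like, here is a minimal sketch that selects the SSE4.1 dot-product intrinsic `_mm_dp_ps` at compile time and otherwise falls back to a multiply followed by a shuffle-based sum. The function name `Dot4` and the use of the compiler-defined `__SSE4_1__` macro are assumptions made for this example; the library's own macros and detection logic may differ.

```cpp
#include <xmmintrin.h>  // SSE: __m128, _mm_mul_ps, _mm_shuffle_ps
#if defined(__SSE4_1__)
#include <smmintrin.h>  // SSE4.1: _mm_dp_ps
#endif

// Hypothetical helper: 4-component dot product of a and b.
inline float Dot4(__m128 a, __m128 b) {
#if defined(__SSE4_1__)
  // Mask 0xF1: the high nibble multiplies all 4 lanes, the low nibble
  // writes the sum to lane 0 only.
  return _mm_cvtss_f32(_mm_dp_ps(a, b, 0xF1));
#else
  // Fallback: per-lane multiply, then a shuffle-based horizontal sum.
  const __m128 prod = _mm_mul_ps(a, b);
  const __m128 shuf = _mm_shuffle_ps(prod, prod, _MM_SHUFFLE(2, 3, 0, 1));
  const __m128 sums = _mm_add_ps(prod, shuf);  // (p0+p1, p0+p1, p2+p3, p2+p3)
  return _mm_cvtss_f32(_mm_add_ss(sums, _mm_movehl_ps(shuf, sums)));
#endif
}
```

For instance, `Dot4(_mm_setr_ps(1, 2, 3, 4), _mm_setr_ps(5, 6, 7, 8))` evaluates to `70.0f` on either code path.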

Cheers,
Guillaume

@guillaumeblanc
Owner

Hi, any news on that front?
