PowerPC, improve code generation for function fvec_L2sqr #3416

Closed


@carll99 carll99 commented May 7, 2024

The code that OpenXL generates for the function fvec_L2sqr does not perform as well as the code that gcc generates on Power. The macros that enable imprecise floating-point operations do not cover Power with OpenXL. This patch adds the OpenXL compiler options to the PowerPC macros to achieve better performance.
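
For readers unfamiliar with this kind of macro, here is a rough sketch of how per-compiler "imprecise floating point" macros can be laid out. It is not the patch itself: the macro names (`IMPRECISE_FUNCTION_BEGIN`, `IMPRECISE_FUNCTION_END`, `IMPRECISE_LOOP`) and the function `l2sqr` are hypothetical, and the pragma choices are assumptions based on what clang and gcc accept, not the exact options added for OpenXL in this PR. The clang branch is included on the assumption that a clang-based compiler such as OpenXL is detected through `__clang__`.

```cpp
#include <cstddef>

// Hypothetical macros, for illustration only. They relax floating-point
// rules for one function so the compiler may reorder and vectorize the sum.
#if defined(__clang__)
// clang-based compilers: relax precision and hint the loop vectorizer.
#define IMPRECISE_FUNCTION_BEGIN _Pragma("float_control(precise, off, push)")
#define IMPRECISE_FUNCTION_END _Pragma("float_control(pop)")
#define IMPRECISE_LOOP _Pragma("clang loop vectorize(enable) interleave(enable)")
#elif defined(__GNUC__)
// gcc: allow reassociation and unrolling for the enclosed function only.
#define IMPRECISE_FUNCTION_BEGIN \
    _Pragma("GCC push_options")  \
    _Pragma("GCC optimize (\"unroll-loops,associative-math,no-signed-zeros\")")
#define IMPRECISE_FUNCTION_END _Pragma("GCC pop_options")
#define IMPRECISE_LOOP
#else
// Unknown compiler: fall back to strict floating-point semantics.
#define IMPRECISE_FUNCTION_BEGIN
#define IMPRECISE_FUNCTION_END
#define IMPRECISE_LOOP
#endif

IMPRECISE_FUNCTION_BEGIN
// Squared L2 distance between two float vectors of length d.
float l2sqr(const float* x, const float* y, std::size_t d) {
    float res = 0.0f;
    IMPRECISE_LOOP
    for (std::size_t i = 0; i < d; i++) {
        const float diff = x[i] - y[i];
        res += diff * diff;
    }
    return res;
}
IMPRECISE_FUNCTION_END
```

Because the relaxed rules let the compiler reorder the additions, results can differ in the last bits from a strictly sequential sum, which is the trade-off the word "imprecise" refers to.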

@alexanderguzhva
Contributor

lgtm

@alexanderguzhva
Contributor

@carll99 does it improve the code for the dot product as well? Did you have a chance to take a look? :)

@carll99
Author

carll99 commented May 8, 2024

We took a look at the dot product and the performance appears to be the same as without the patch.

@mdouze
Contributor

mdouze commented May 10, 2024

lgtm

@facebook-github-bot
Contributor

@mdouze has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mdouze merged this pull request in e1e4ad0.
