API changes associated with move to runtime libraries operation names and version bump? #189

Open
munroesj52 opened this issue Feb 4, 2024 · 1 comment

Comments

@munroesj52
Contributor

Originally PVECLIB was inline only, as the functionality gap across P6/P7/P8 was not so large.

But with P9/P10, some of the P8/P7 implementations of P10/P9 instructions are quite large. This is especially so for quadword integer multiply/divide and Float128. Some of the original DW/QW multiply operations should have library implementations for size, but they still need inline versions, as these are building blocks for int512 multiplies and are needed for remainder verification in divide/divide-extended. Divide/divide-extended are also used in the double quadword long division implementations.
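To illustrate why the inline versions have to stay available as building blocks, here is a minimal portable sketch of a wider multiply composed from a narrower inline one. It is plain scalar C, not PVECLIB code: the `demo_` names and the struct types are hypothetical, and the real operations work on vector registers.

```c
#include <stdint.h>

/* Hypothetical wide types built from 64-bit halves (not PVECLIB types). */
typedef struct { uint64_t hi, lo; } demo_u128;
typedef struct { demo_u128 hi, lo; } demo_u256;

/* Inline building block: 64x64 -> 128-bit unsigned multiply,
   assembled from four 32x32 -> 64-bit partial products. */
static inline demo_u128
demo_muludw_inline (uint64_t a, uint64_t b)
{
  uint64_t a_lo = (uint32_t) a, a_hi = a >> 32;
  uint64_t b_lo = (uint32_t) b, b_hi = b >> 32;
  uint64_t p0 = a_lo * b_lo;
  uint64_t p1 = a_lo * b_hi;
  uint64_t p2 = a_hi * b_lo;
  uint64_t p3 = a_hi * b_hi;
  /* Middle column; at most 3 * (2^32 - 1), so it cannot overflow. */
  uint64_t mid = (p0 >> 32) + (uint32_t) p1 + (uint32_t) p2;
  demo_u128 r;
  r.lo = (mid << 32) | (uint32_t) p0;
  r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
  return r;
}

/* Wider operation: 128x128 -> 256-bit unsigned multiply, composed
   from four calls to the inline building block above. */
static inline demo_u256
demo_muludq_inline (demo_u128 a, demo_u128 b)
{
  demo_u128 ll = demo_muludw_inline (a.lo, b.lo);
  demo_u128 lh = demo_muludw_inline (a.lo, b.hi);
  demo_u128 hl = demo_muludw_inline (a.hi, b.lo);
  demo_u128 hh = demo_muludw_inline (a.hi, b.hi);
  demo_u256 r;
  uint64_t t, c1, c2;

  r.lo.lo = ll.lo;

  /* Column 1: ll.hi + lh.lo + hl.lo, counting carries out in c1. */
  t = ll.hi + lh.lo;
  c1 = (t < lh.lo);
  t += hl.lo;
  c1 += (t < hl.lo);
  r.lo.hi = t;

  /* Column 2: lh.hi + hl.hi + hh.lo + c1, counting carries out in c2. */
  t = lh.hi + hl.hi;
  c2 = (t < hl.hi);
  t += hh.lo;
  c2 += (t < hh.lo);
  t += c1;
  c2 += (t < c1);
  r.hi.lo = t;

  /* Top column: the full product fits in 256 bits, so no carry out. */
  r.hi.hi = hh.hi + c2;
  return r;
}
```

If the narrower multiply existed only as an extern library function, every partial product here would cost a call, which is the reason for keeping an inline version next to any library version.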

There is a temptation to rename the inline operation to name_inline and use the original name as the extern symbol for the library implementation. This does not technically change the API, but it does change the ABI, as a user of that symbol now has to link their library/application to the PVECLIB runtime library.
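A minimal sketch of that naming pattern, assuming a scalar stand-in operation. The `demo_mulhud` names are hypothetical and are not PVECLIB symbols; `unsigned __int128` is a GCC/Clang extension on 64-bit targets.

```c
#include <stdint.h>

/* ---- what would live in the public header ---- */

/* Inline base implementation, renamed with the _inline suffix.
   Using it adds no link-time dependency. */
static inline uint64_t
demo_mulhud_inline (uint64_t a, uint64_t b)
{
  /* High 64 bits of the 128-bit product. */
  return (uint64_t) (((unsigned __int128) a * b) >> 64);
}

/* The original name is now only declared here. Source that calls
   demo_mulhud() still compiles unchanged (same API), but the call
   becomes an undefined symbol that the runtime library must resolve. */
extern uint64_t demo_mulhud (uint64_t a, uint64_t b);

/* ---- what would live in one runtime library source file ---- */

/* Single out-of-line copy, built from the inline implementation. */
uint64_t
demo_mulhud (uint64_t a, uint64_t b)
{
  return demo_mulhud_inline (a, b);
}
```

A caller that was previously header-only and keeps using the original name would now fail at link time unless it adds the runtime library, even though its source did not change.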

I cannot say that I have actually done this, but I can't swear that I have not. Frankly, I can't remember.

The question is: does this justify a version bump?

@munroesj52
Contributor Author

More thinking.

The divide/modulo operations are new with P10 and are not included in V1.0.4-5, so those are not API/ABI issues yet. Updating current master to have, for example, vec_divdqu() as the extern and vec_divdqu_inline() as the base implementation would be OK.

The doubleword/quadword multiply operations are prominent in V1.0.4-5 and earlier releases. For example: vec_mulhuq(), vec_mulluq(), vec_muludq(). For PWR8, vec_muludq() expands into 36 instructions and runs 50+ cycles.

This would be the best example of the API/ABI naming issue.
