Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance Intrinsic::fmuladd translation. #2162

Open
haonanya opened this issue Sep 26, 2023 · 3 comments
Open

Enhance Intrinsic::fmuladd translation. #2162

haonanya opened this issue Sep 26, 2023 · 3 comments

Comments

@haonanya
Copy link
Contributor

haonanya commented Sep 26, 2023

-cl-mad-enable is off by default, however ReplaceLLVMFmulAddWithOpenCLMad is true by default https://github.com/KhronosGroup/SPIRV-LLVM-Translator/blob/main/include/LLVMSPIRVOpts.h#L244C8-L244C40, which seems not consistent. Intrinsic::fmuladd should be translated into mad when function attribute "less-precise-fpmad" = "true" which indicates -cl-mad-enable is specified. When "less-precise-fpmad" = "false" or there is no "less-precise-fpmad", translate it into a pair of fmul + fadd.
Can we remove ReplaceLLVMFmulAddWithOpenCLMad uses and let only "less-precise-fpmad" determine how to translate fmuladd?
@AlexeySachkov, @MrSidims, @svenvh , do you think is ReplaceLLVMFmulAddWithOpenCLMad still needed and what's your opinion for above comments? Thanks very much. #2163 is drafted and need further discussion.

@wenju-he
Copy link
Contributor

fmuladd should be translated to fmuladd as it is. It is likely the lack of support of the translation in spirv spec so that it is translate to mad.

@AlexeySachkov
Copy link
Contributor

AlexeySachkov commented Sep 27, 2023

fmuladd should be translated to fmuladd as it is. It is likely the lack of support of the translation in spirv spec so that it is translate to mad.

That is correct. We don't have a direct equivalent of fmuladd in SPIR-V and therefore we have decided to go with mad instead.

I'm not entirely sure if we only need to use mad when -cl-mad-enable is present, because for FULL profile mad is expected to have the same semantics as fmuladd. Therefore, the current approach is valid for FULL profile. At first glance, it seems to me that if someone is targeting an embedded profile where mad could be less precise, then they should switch the value of the option.

Tagging @bashbaug for awareness as well

@bashbaug
Copy link
Contributor

I agree with @AlexeySachkov - for full profile devices the OpenCL mad extended instruction has the same semantics as the LLVM fmuladd.

LLVM: "The ‘llvm.fmuladd.*’ intrinsic functions represent multiply-add expressions that can be fused if the code generator determines that (a) the target instruction set has support for a fused operation, and (b) that the fused operation is more efficient than the equivalent, separate pair of mul and add instructions."

SPIR-V: "OpExtInst mad - Implemented either as a correctly rounded fma, or as a multiply followed by an add, both of which are correctly rounded."

We have an internal Khronos tracker to improve this (SPIR-V issue 409) and support fmuladd more directly, but because it is only needed for embedded profile devices we currently consider it low priority.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants