AArch64: Fix fmla and fmls element count and size for halfword cases #6543
As part of a research project testing the accuracy of the sleigh specifications against real hardware, we observed unexpected behaviour in the fmla and fmls instructions for AArch64. According to Sections C7.2.122 and C7.2.126, these instructions should operate on 16-bit floats when the 4H or 8H register arrangements are used. The current specification instead treats each adjacent pair of 16-bit floats as a single 32-bit float.
e.g.:
0xe70e410e
"fmla v7.4H, v23.4H, v1.4H" with z7=0x82ce9a6474c3d5fa, z23=0x63518afba03f54aa and z1=0x7223a9bdc6af50c8
Hardware Reference: z7 = 0x7c009a5f74c36963
Existing Spec: z7 = 0x7f80000074c3d5fa
Patched Spec: z7 = 0x7c009a5f74c36963