Numerical error introduced using ARM64 using clang-14 but not clang-13 #91824

Closed
phargogh opened this issue May 10, 2024 · 5 comments
Labels
backend:AArch64 floating-point Floating-point math question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!

Comments

@phargogh

Hello! I am seeing an issue with clang on ARM64 (reproducible on an M1 Mac and a Raspberry Pi 4B) where chaining a series of mathematical operations together introduces a small numerical error. As a workaround, breaking some of the math onto a second line avoids the error. I have tested this on a Raspberry Pi 4B and also on the following Debian versions on my M1 Mac:

  • debian buster (arm64, gcc 8.3.0, clang 7.0.1, no numerical issue)
  • debian bullseye (arm64, gcc 10.2.1, clang 11.0.1-2, no numerical issue)
  • debian bookworm (arm64, gcc 12.2.0, clang 13.0.1, no numerical issue)
  • debian bookworm (arm64, gcc 12.2.0, clang 14.0.6, numerical issue present)
  • debian trixie (arm64, gcc 13.2.0, clang 16.0.6, numerical issue present)
  • debian sid (arm64, gcc 13.2.0, clang 16.0.6, numerical issue present)

A minimal reproducible sample is here in this gist: https://gist.github.com/phargogh/c4264b37e7f0beed31661eacce53d14a

Thank you!

@topperc
Collaborator

topperc commented May 10, 2024

Try passing -ffp-contract=off. The default was changed to -ffp-contract=on in clang 14.

-ffp-contract=on enables the use of FMA instructions for A * B + C if they appear in the same expression. FMA keeps the full precision of the multiply result before doing the addition. Using -ffp-contract=off causes the multiply result to be rounded to a double before doing the addition.

@llvmbot
Collaborator

llvmbot commented May 10, 2024

@llvm/issue-subscribers-backend-aarch64

Author: James Douglass (phargogh)

@EugeneZelenko EugeneZelenko added the floating-point Floating-point math label May 10, 2024
@phargogh
Author

Thanks @topperc ! I confirm that the behavior is "restored" with -ffp-contract=off. I'll make sure our builds reflect this extra flag.

Although, I'm a little perplexed about why this behavior is creating different results on ARM64 vs X86_64. Any ideas?

phargogh added a commit to phargogh/invest that referenced this issue May 13, 2024
@arsenm
Contributor

arsenm commented May 13, 2024

> Thanks @topperc ! I confirm that the behavior is "restored" with -ffp-contract=off. I'll make sure our builds reflect this extra flag.
>
> Although, I'm a little perplexed about why this behavior is creating different results on ARM64 vs X86_64. Any ideas?

Because the decision is based on whether the backend thinks FMA is fast. On arm64 that is always yes; for baseline x86_64 it is no. If you use -mfma or target any reasonably modern CPU, you'll probably get the fused result.

@phargogh
Author

Thanks for the information! I think my issue is resolved here, so I'll go ahead and close the issue.

@EugeneZelenko EugeneZelenko added the question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead! label May 25, 2024