Proposal: Fp IEEE compliance level flags for Wasm (FP-Fast-Math for Wasm Scalar & SIMD) #1393

Closed
arunetm opened this issue Jan 12, 2021 · 8 comments

Comments

@arunetm
Contributor

arunetm commented Jan 12, 2021

Native compilers like gcc, clang, and msvc allow developers to set FP IEEE compliance levels through pragmas or compile-time flags such as /fp:strict and /fp:fast, -ffast-math, -ffp-contract, etc. These flags are beneficial for a wide range of applications (e.g. ML convolutions, DSP, low-precision graphics/physics engines) where developers prefer to trade portable precision for performance.

The performance gains can be significant depending on the specific settings and compilers used. These developer hints/preferences allow compilers to safely perform more FP math optimizations, better instruction selection (fusing/FMA), value-range restrictions, and relaxed validations.

Currently, when developers specify these flags for Wasm, they are consumed by the developer toolchain, which may honor only a few of them depending on the specific tool used (e.g. WebAssembly/binaryen#3155). The preference information is then discarded, so it is not available to runtimes that might want to use it.

Wasm runtimes can benefit from having the means to access these developer flags/hints to make their own decisions on optimization and instruction selection when it's safe to do so. This will be particularly useful for performing additional runtime optimizations, especially in AOT Wasm compilers. It also helps to address a few of the known performance concerns in FP Wasm SIMD codegen, such as rounding, min/max, etc. At a high level, this will allow runtimes not to depend on developer tools for certain FP optimizations and removes a blocker for Wasm to track native performance more closely.

There is the precedent of JVM 1.2 relaxing IEEE compliance as the default mode and introducing the 'strictfp' modifier to ensure portability at class/interface/method granularity. There is an opportunity to explore a more backward-compatible approach for Wasm.

I would like to propose a mechanism to encode FP IEEE compliance flags in the Wasm binary to be consumed by the runtime engines. As is the case in native languages, the flags themselves can be treated as optional, and their use can be a choice of the runtimes. The impact will manifest as improved performance, consistent semantics on a given platform, and reduced portability across platforms. The proposed mechanism will enable unambiguously marking code sections within a Wasm binary with these preferences/hints at the granularity of a block of instructions.

The design can be ironed out in detail as we proceed. One option is to introduce a new custom section with entries marking preferences and code segment offsets; another option is to introduce a new instruction to mark code segments with these developer preferences, as in the JVM. The specific flags to support can be discussed and incorporated as we proceed (e.g. -ffinite-math-only, -fno-signed-zeros, ...).
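
As a rough illustration of the custom-section option, the sketch below encodes one possible layout in Python. Only the outer framing (section id 0, a LEB128 size, then a length-prefixed name) comes from the Wasm binary format; the section name `fp-hints`, the flag bits, and the entry layout (function index, start offset, end offset, flag mask) are hypothetical and not part of any proposal.

```python
# Hypothetical flag bits; the names mirror gcc/clang options for readability only.
FP_FINITE_MATH_ONLY = 1 << 0
FP_NO_SIGNED_ZEROS = 1 << 1
FP_ALLOW_CONTRACT = 1 << 2


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128, as Wasm does for u32."""
    if value < 0:
        raise ValueError("uleb128 needs a non-negative integer")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_fp_hints_section(entries, name="fp-hints"):
    """Build a custom section (id 0) from (func_index, start, end, flags) tuples.

    The entry layout is an assumption made for this sketch.
    """
    name_bytes = name.encode("utf-8")
    payload = bytearray(uleb128(len(name_bytes)) + name_bytes)
    payload += uleb128(len(entries))
    for func_index, start, end, flags in entries:
        payload += uleb128(func_index) + uleb128(start) + uleb128(end) + uleb128(flags)
    return b"\x00" + uleb128(len(payload)) + bytes(payload)


# Mark bytes 0..200 of (hypothetical) function 3 as tolerant of unsigned zeros
# and of multiply-add contraction.
section = encode_fp_hints_section(
    [(3, 0, 200, FP_NO_SIGNED_ZEROS | FP_ALLOW_CONTRACT)]
)
```

Because engines may skip custom sections they do not recognize, hints encoded this way would stay optional, which matches the intent described above.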

This mechanism will complement the discussions to add Scalar/Vector FMA, and FP approximation instructions to Wasm and/or SIMD spec.

This issue is to track the interest in this topic and to discuss this in the CG sync.

arunetm added a commit to arunetm/meetings that referenced this issue Jan 12, 2021
Adding an agenda item to discuss WebAssembly/design#1393
@tlively
Member

tlively commented Jan 12, 2021

Since the strictness of fp operations affects the semantics of a program, I would expect that the best way to make more permissive semantics possible would be to introduce new instructions rather than a custom section or block construct. We already have the spec mechanisms necessary to handle float nondeterminism due to our NaN propagation rules, so I expect it would be straightforward to introduce new floating point instructions that allow for more possible results.
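
One way to picture this suggestion: a relaxed instruction is specified by the set of results an engine may return, and any engine whose output falls in that set is conforming. The Python sketch below models a hypothetical relaxed multiply-add that may be computed either fused or unfused. The function names are made up for illustration, and the fused result is emulated with exact rational arithmetic rather than a hardware FMA.

```python
from fractions import Fraction


def madd_unfused(a: float, b: float, c: float) -> float:
    """Multiply, round to double, then add and round again."""
    return a * b + c


def madd_fused(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly and round once (finite inputs only)."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def relaxed_madd_allowed(a: float, b: float, c: float) -> set:
    """Results a hypothetical relaxed multiply-add would be allowed to return."""
    return {madd_unfused(a, b, c), madd_fused(a, b, c)}


a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = madd_unfused(a, b, c)  # a*b rounds to 1.0, so the sum is 0.0
fused = madd_fused(a, b, c)      # exact a*b is 1 - 2**-54, so the sum is -2**-54
allowed = relaxed_madd_allowed(a, b, c)
```

For these inputs the allowed set has two distinct members, so two conforming engines could legitimately disagree. For most inputs the two computations coincide and the set collapses to a single value.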

@sunfishcode
Member

As additional context, the JVM, for its part, is considering removing strictfp and only supporting the strict semantics.

@arunetm
Contributor Author

arunetm commented Jan 12, 2021

> the best way to make more permissive semantics possible would be to introduce new instructions rather than a custom section or block construct.

Introducing new instructions is useful, but it will not scale well or remove the toolchain dependency completely. Instructions with good platform support, like fma, reciprocal, etc., are good candidates for new instruction addition. There is considerable variety in the hints/flags offered by native compilers:
-fassociative-math -ffast-math -fno-honor-nans -ffinite-math-only -fdenormal-fp-math -fno-strict-float-cast-overflow -fno-math-errno -fno-trapping-math ...
Most of these are hints that allow more aggressive FP optimizations by lifting restrictions on algebraic transformations, NaNs, signed zeros, traps, rounding, etc. Expressing all the useful flags as new instructions is not ideal imo.
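
To make the effect of such hints concrete, the Python sketch below checks two rewrites that are only valid once the corresponding restriction is lifted: folding `x + 0.0` to `x` (needs a no-signed-zeros style hint) and folding `x == x` to true (needs a no-NaNs style hint). The helper names are illustrative and do not correspond to any compiler API.

```python
import math


def same_float(x: float, y: float) -> bool:
    """Equality that distinguishes -0.0 from +0.0 and matches any NaN with any NaN."""
    if math.isnan(x) or math.isnan(y):
        return math.isnan(x) and math.isnan(y)
    return x == y and math.copysign(1.0, x) == math.copysign(1.0, y)


def add_zero_strict(x: float) -> float:
    return x + 0.0  # IEEE 754: -0.0 + 0.0 is +0.0 under round-to-nearest


def add_zero_folded(x: float) -> float:
    return x  # what an optimizer may emit when signed zeros are ignored


def self_equal_strict(x: float) -> bool:
    return x == x  # false when x is NaN


def self_equal_folded(x: float) -> bool:
    return True  # what an optimizer may emit when NaNs are assumed absent


neg_zero = -0.0
nan = float("nan")

# Both rewrites change the observable result for the special value they ignore.
zero_rewrite_is_exact = same_float(add_zero_strict(neg_zero), add_zero_folded(neg_zero))
nan_rewrite_is_exact = self_equal_strict(nan) == self_equal_folded(nan)
# For ordinary values the strict and folded forms agree.
ordinary_value_agrees = same_float(add_zero_strict(1.5), add_zero_folded(1.5))
```

Each flag effectively licenses a family of rewrites like these, which is why mapping every flag onto a dedicated instruction is hard to do one-to-one.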

> As additional context, the JVM, for its part, is considering removing strictfp and only supporting the strict semantics.

Thanks @sunfishcode for this added context! I didn't know about this update, and it seems the motivation is to consolidate math library variants. On a closer look, Java's strictfp appears to be more specialized and is not a good representation of the range of control points offered by gcc/clang. -ffast-math and its associated flags seem to be quite popular in native repos, judging from a quick GitHub search. I plan to look into the uses of the above flags and the associated pragmas.

dschuff added a commit to WebAssembly/meetings that referenced this issue Jan 13, 2021
Adding an agenda item to discuss WebAssembly/design#1393

Co-authored-by: Derek Schuff <dschuff@chromium.org>
@sunfishcode
Member

sunfishcode commented Jan 21, 2021

Typical projects which use gcc/clang's -ffast-math are compiling to native code. Application developers using them typically test all the native code variants that they themselves build. And since all popular hardware ISAs have fully deterministic floating-point behavior, once the developers have tested those variants, they can be fairly sure that the behavior won't change for their users. Currently, WebAssembly's floating-point works this way too; it's fully deterministic, just like hardware ISAs, so existing long-standing developer assumptions about being able to add -ffast-math and test that it "works for them" are upheld.

A WebAssembly-level -ffast-math flag would mean that WebAssembly no longer resembles a hardware ISA in this regard, and no longer upholds these assumptions. Developers would test the binary they produce, but when shipping wasm to their users, their users should expect to be able to run it on different hardware or different engines. With fast-math-like flags at the WebAssembly level, WebAssembly wouldn't behave like an ISA, and users could see different floating-point results than the developer tested with.

It's also worth pointing out that projects using these flags aren't missing out when compiling to WebAssembly today. For example, many of the optimizations enabled by -fassociative-math are loop-oriented optimizations that LLVM is able to do before producing WebAssembly.
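
As an illustration of that point, the Python sketch below mimics what a reassociating vectorizer might do to a sum reduction: splitting it into two interleaved accumulators. The rewrite changes the result for the chosen inputs, but it happens in the toolchain, so the emitted module would contain the second, fixed evaluation order and every engine would compute the same value. The function names and inputs are made up for illustration.

```python
def sum_in_order(values):
    """Strict left-to-right summation, as written in the source program."""
    acc = 0.0
    for v in values:
        acc += v
    return acc


def sum_two_lanes(values):
    """Same reduction with two interleaved accumulators, as a vectorizer might emit."""
    lane0 = 0.0
    lane1 = 0.0
    for i in range(0, len(values) - 1, 2):
        lane0 += values[i]
        lane1 += values[i + 1]
    if len(values) % 2:
        lane0 += values[-1]  # leftover element for odd lengths
    return lane0 + lane1


values = [1e16, 1.0, -1e16, 1.0]
in_order = sum_in_order(values)    # the first 1.0 is absorbed when added to 1e16
two_lanes = sum_two_lanes(values)  # the two 1.0 terms are summed in their own lane
```

Here `in_order` and `two_lanes` differ, yet each function on its own is fully deterministic, which is the property the comment above wants Wasm to keep.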

@aardappel

Yes, any decisions that affect float precision/behavior should be baked into the Wasm module by the tools. New instructions can be added for behavior that cannot be expressed with the current ones.

@arunetm
Contributor Author

arunetm commented Jan 22, 2021

We had a discussion on this topic at the last CG meeting, with new instruction addition as the alternative means to the goal. There seems to be general interest in solving this through the latter direction. I don't have objections if developer expectations on performance gains can be upheld by toolchain optimizations and by introducing the missing instruction variants. In that case, there is no good reason to propagate these flags to the runtime and compromise Wasm's level of abstraction. It will be good to understand which semantic-variant instructions are necessary to reach this goal. We have a few new instructions identified in the context of SIMD, which may need to be extended to match the instructions that tools need in order to express fast-math flags fully. Will continue looking in that direction.

Thanks for the feedback.

@arunetm
Contributor Author

arunetm commented Mar 1, 2021

Content used for CG discussion. Wasm ffast-math.pptx

@arunetm arunetm mentioned this issue Mar 1, 2021
@arunetm
Contributor Author

arunetm commented Oct 4, 2021

The Relaxed SIMD proposal subsumes most of the desired functionality of this proposal. Closing this issue in favor of Relaxed SIMD.

@arunetm arunetm closed this as completed Oct 4, 2021
Snektron added a commit to Snektron/zig that referenced this issue Oct 14, 2023
WebAssembly doesn't implement the correct semantics
for this operation. See WebAssembly/design#1393