Redundant checks in floating point to integer casts on WASM #73591

CryZe · 2020-06-21T19:31:04Z

If you compile the following Rust code to WASM:

pub unsafe fn cast(x: f64) -> u8 {
    x.to_int_unchecked()
}

it compiles to the following with Rust 1.45:

example::cast:
        block           
        local.get       0
        f64.const       0x1p32
        f64.lt  
        local.get       0
        f64.const       0x0p0
        f64.ge  
        i32.and 
        i32.eqz
        br_if           0
        local.get       0
        i32.trunc_f64_u
        return
        end_block
        i32.const       0
        end_function

The same happens in Rust <1.45 if you use as _. In both cases the following LLVM IR is emitted:

define hidden i8 @_ZN7example4cast17he66904755097d668E(double %x) unnamed_addr #0 !dbg !5 {
  %0 = fptoui double %x to i8, !dbg !9
  ret i8 %0, !dbg !22
}

fptoui is explicitly defined to be UB if an out of range value is provided.

If the value cannot fit in ty2, the result is a poison value.

However it seems like the WASM backend in LLVM ignores that and emits additional bounds checks.

Additionally this gets even worse in Rust 1.45 if you use as _ instead:

example::cast:
        local.get       0
        f64.const       0x1.fep7
        f64.gt  
        local.set       1
        block           
        block           
        local.get       0
        f64.const       0x0p0
        local.get       0
        f64.const       0x0p0
        f64.gt  
        f64.select
        local.tee       0
        f64.const       0x1p32
        f64.lt  
        local.get       0
        f64.const       0x0p0
        f64.ge  
        i32.and 
        i32.eqz
        br_if           0
        local.get       0
        i32.trunc_f64_u
        local.set       2
        br              1
        end_block
        i32.const       0
        local.set       2
        end_block
        i32.const       -1
        local.get       2
        local.get       1
        i32.select
        end_function

This means that in the end in the WASM VM will do 3 range checks (the one emitted by rust, the one emitted by the llvm backend, and the one it needs to do to possibly trap).

Additionally Rust's saturating checks don't play well with WASM's nontrapping-fptoint target-feature as there's still redundant checks:

example::cast:
        i64.const       9223372036854775807
        local.get       0
        f32.const       -0x1p63
        local.get       0
        f32.const       -0x1p63
        f32.gt  
        f32.select
        i64.trunc_sat_f32_s
        local.get       0
        f32.const       0x1.fffffep62
        f32.gt  
        i64.select
        i64.const       0
        local.get       0
        local.get       0
        f32.eq  
        i64.select
        end_function

The text was updated successfully, but these errors were encountered:

CryZe · 2020-06-22T11:29:19Z

It seems to come down to this commit by @sunfishcode in LLVM llvm/llvm-project@cdd48b8

I fail to understand the justification here:

// Lower an fp-to-int conversion operator from the LLVM opcode, which has an
// undefined result on invalid/overflow, to the WebAssembly opcode, which
// traps on invalid/overflow.

If the LLVM opcode doesn't define the behavior for invalid results, then why is the WASM backend trying to paper over the UB and instead introduces checks to supposedly trap? (which I don't think it even does, it just returns dummy values)

sunfishcode · 2020-06-22T12:26:28Z

In the context of that comment, "undefined result" doesn't mean "undefined behavior". It means the operation produces a return value, without trapping, but the specific value is undefined. Since's the wasm operator in question traps in some questions, extras checks are required to prevent trapping, to emulate the LLVM IR semantics.

#71269 has now landed, so Rust's semantics for fp-to-int conversion match those of wasm's nontrapping-fptoint feature, which is now part of wasm's core spec, though not all engines implement it yet. So for anyone interested in working on this, the path forward is to teach the Rust backend to take advantage of this wasm feature, when present, to avoid emitting the extra range checks.

CryZe · 2020-06-22T13:05:01Z

So from what I'm understanding from more code, LLVM could be reordering fptoui / fptosi to be speculatively executed earlier, which would mean the code traps when it possibly shouldn't have, as that cast may have never been executed. Is that correct?

There also seem to be LLVM intrinsics in the WASM backend now such as i32 @llvm.wasm.trunc.signed.i32.f32(float). Can those be speculatively executed? If no, the code should probably be lowered like this:

	`-nontrapping-fptoint`	`+nontrapping-fptoint`
`.to_int_unchecked()`	`llvm.wasm.trunc.*`	`llvm.wasm.trunc.*`
`as _`	`llvm.wasm.trunc.*` + manual saturation checks	`llvm.wasm.trunc.saturate.*`

If yes, it should probably be this:

	`-nontrapping-fptoint`	`+nontrapping-fptoint`
`.to_int_unchecked()`	`fptosi` / `fptoui`	`llvm.wasm.trunc.saturate.*`
`as _`	`llvm.wasm.trunc.*` + manual saturation checks (or maybe that intrinsic is still dangerous in that case?)	`llvm.wasm.trunc.saturate.*`

sunfishcode · 2020-06-22T13:26:51Z

Yes, LLVM can reorder and speculate fptosi/fptoui as it assumes they don't trap.

@llvm.wasm.trunc.signed.i32.f32 can trap, so it cannot be speculatively executed in general.

…crichton Use WASM's saturating casts if they are available WebAssembly supports saturating floating point to integer casts behind a target feature. The feature is already available on many browsers. Beginning with 1.45 Rust will start defining the behavior of floating point to integer casts to be saturating as well. For this Rust constructs additional checks on top of the `fptoui` / `fptosi` instructions it emits. Here we introduce the possibility for the codegen backend to construct saturating casts itself and only fall back to constructing the checks ourselves if that is not possible. Resolves part of rust-lang#73591

This commit improves codegen for unchecked casts on WebAssembly targets to use the singluar `iNN.trunc_fMM_{u,s}` instructions. Previously rustc would codegen a bare `fptosi` and `fptoui` for float casts but for WebAssembly targets the codegen for these instructions is quite large. This large codegen is due to the fact that LLVM can speculate these instructions so the trapping behavior of WebAssembly needs to be protected against in case they're speculated. The change here is to update the codegen for the unchecked cast intrinsics to have a wasm-specific case where they call the appropriate LLVM intrinsic to generate the right wasm instruction. The intrinsic is explicitly opting-in to undefined behavior so a trap here for out-of-bounds inputs on wasm should be acceptable. cc rust-lang#73591

alexcrichton · 2020-07-22T22:07:03Z

I've posted #74659 to improve the unchecked -nontrapping-fptoint case

Improve codegen for unchecked float casts on wasm This commit improves codegen for unchecked casts on WebAssembly targets to use the singluar `iNN.trunc_fMM_{u,s}` instructions. Previously rustc would codegen a bare `fptosi` and `fptoui` for float casts but for WebAssembly targets the codegen for these instructions is quite large. This large codegen is due to the fact that LLVM can speculate these instructions so the trapping behavior of WebAssembly needs to be protected against in case they're speculated. The change here is to update the codegen for the unchecked cast intrinsics to have a wasm-specific case where they call the appropriate LLVM intrinsic to generate the right wasm instruction. The intrinsic is explicitly opting-in to undefined behavior so a trap here for out-of-bounds inputs on wasm should be acceptable. cc rust-lang#73591

alexcrichton · 2020-07-23T19:36:08Z

Ok, I've improved the last case at #74695 and I think that's good enough to close this out.

@1

…es, r=nagisa rustc: Improving safe wasm float->int casts This commit improves code generation for WebAssembly targets when translating floating to integer casts. This improvement is only relevant when the `nontrapping-fptoint` feature is not enabled, but the feature is not enabled by default right now. Additionally this improvement only affects safe casts since unchecked casts were improved in rust-lang#74659. Some more background for this issue is present on rust-lang#73591, but the general gist of the issue is that in LLVM the `fptosi` and `fptoui` instructions are defined to return an `undef` value if they execute on out-of-bounds values; they notably do not trap. To implement these instructions for WebAssembly the LLVM backend must therefore generate quite a few instructions before executing `i32.trunc_f32_s` (for example) because this WebAssembly instruction traps on out-of-bounds values. This codegen into wasm instructions happens very late in the code generator, so what ends up happening is that rustc inserts its own codegen to implement Rust's saturating semantics, and then LLVM also inserts its own codegen to make sure that the `fptosi` instruction doesn't trap. Overall this means that a function like this: #[no_mangle] pub unsafe extern "C" fn cast(x: f64) -> u32 { x as u32 } will generate this WebAssembly today: (func $cast (type 0) (param f64) (result i32) (local i32 i32) local.get 0 f64.const 0x1.fffffffep+31 (;=4.29497e+09;) f64.gt local.set 1 block ;; label = @1 block ;; label = @2 local.get 0 f64.const 0x0p+0 (;=0;) local.get 0 f64.const 0x0p+0 (;=0;) f64.gt select local.tee 0 f64.const 0x1p+32 (;=4.29497e+09;) f64.lt local.get 0 f64.const 0x0p+0 (;=0;) f64.ge i32.and i32.eqz br_if 0 (;@2;) local.get 0 i32.trunc_f64_u local.set 2 br 1 (;@1;) end i32.const 0 local.set 2 end i32.const -1 local.get 2 local.get 1 select) This PR improves the situation by updating the code generation for float-to-int conversions in rustc, specifically only for WebAssembly targets and only for some situations (float-to-u8 still has not great codegen). The fix here is to use basic blocks and control flow to avoid speculatively executing `fptosi`, and instead LLVM's raw intrinsic for the WebAssembly instruction is used instead. This effectively extends the support added in rust-lang#74659 to checked casts. After this commit the codegen for the above Rust function looks like: (func $cast (type 0) (param f64) (result i32) (local i32) block ;; label = @1 local.get 0 f64.const 0x0p+0 (;=0;) f64.ge local.tee 1 i32.const 1 i32.xor br_if 0 (;@1;) local.get 0 f64.const 0x1.fffffffep+31 (;=4.29497e+09;) f64.le i32.eqz br_if 0 (;@1;) local.get 0 i32.trunc_f64_u return end i32.const -1 i32.const 0 local.get 1 select) For reference, in Rust 1.44, which did not have saturating float-to-integer casts, the codegen LLVM would emit is: (func $cast (type 0) (param f64) (result i32) block ;; label = @1 local.get 0 f64.const 0x1p+32 (;=4.29497e+09;) f64.lt local.get 0 f64.const 0x0p+0 (;=0;) f64.ge i32.and i32.eqz br_if 0 (;@1;) local.get 0 i32.trunc_f64_u return end i32.const 0) So we're relatively close to the original codegen, although it's slightly different because the semantics of the function changed where we're emulating the `i32.trunc_sat_f32_s` instruction rather than always replacing out-of-bounds values with zero. There is still work that could be done to improve casts such as `f32` to `u8`. That form of cast still uses the `fptosi` instruction which generates lots of branch-y code. This seems less important to tackle now though. In the meantime this should take care of most use cases of floating-point conversion and as a result I'm going to speculate that this... Closes rust-lang#73591

CryZe added the C-bug Category: This is a bug. label Jun 21, 2020

jonas-schievink added I-slow Issue: Problems and improvements with respect to performance of generated code. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. labels Jun 21, 2020

CryZe mentioned this issue Jun 25, 2020

Use WASM's saturating casts if they are available #73724

Merged

alexcrichton mentioned this issue Jul 22, 2020

Improve codegen for unchecked float casts on wasm #74659

Merged

alexcrichton mentioned this issue Jul 23, 2020

rustc: Improving safe wasm float->int casts #74695

Merged

bors closed this as completed in 2c1b046 Aug 4, 2020

CryZe mentioned this issue Jul 30, 2021

WASM float to int performance regression since 1.53.0 #87643

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Redundant checks in floating point to integer casts on WASM #73591

Redundant checks in floating point to integer casts on WASM #73591

CryZe commented Jun 21, 2020

CryZe commented Jun 22, 2020 •

edited

sunfishcode commented Jun 22, 2020

CryZe commented Jun 22, 2020 •

edited

sunfishcode commented Jun 22, 2020

alexcrichton commented Jul 22, 2020

alexcrichton commented Jul 23, 2020

Redundant checks in floating point to integer casts on WASM #73591

Redundant checks in floating point to integer casts on WASM #73591

Comments

CryZe commented Jun 21, 2020

CryZe commented Jun 22, 2020 • edited

sunfishcode commented Jun 22, 2020

CryZe commented Jun 22, 2020 • edited

sunfishcode commented Jun 22, 2020

alexcrichton commented Jul 22, 2020

alexcrichton commented Jul 23, 2020

CryZe commented Jun 22, 2020 •

edited

CryZe commented Jun 22, 2020 •

edited