Comparing to infinity is buggy on x87 #72327

eduardosm · 2020-05-18T16:02:31Z

When code is compiled using the X87 FPU, comparing to infinity can lead to unexpected behavior.

I tried this code:

#[inline(never)]
fn get_num() -> f64 {
    let num: f64 = 1.0e300;
    // volatile is to avoid optimizations
    unsafe { std::ptr::read_volatile(&num) }
}

fn main() {
    let x = get_num();
    let y = get_num();
    let z = x * y;
    if z != f64::INFINITY && z != f64::NEG_INFINITY && !z.is_nan() {
        let exp = (z.to_bits() >> 52) & 0x7FF;
        assert!(exp != 0x7FF);
        println!("is finite, exp = {}", exp);
    } else {
        println!("is not finite");
    }
}

I expected to see this happen:

This should either print "infinite or nan" or "is finite, exp = ...". The assert! should never fail because the exponent bits are 0x7FF if and only if the number is infinity or nan, so the if condition should have been false.

Instead, this happened:

When compiled for X86 without SSE, the assert will fail. This could also cause safety issues if unsafe code relies on it.

The following command can be used to compile without SSE:

RUSTFLAGS="-C target-cpu=pentium" cargo run --target i686-unknown-linux-gnu

Meta

rustc --version --verbose:

rustc 1.43.1 (8d69840ab 2020-05-04)
binary: rustc
commit-hash: 8d69840ab92ea7f4d323420088dd8c9775f180cd
commit-date: 2020-05-04
host: x86_64-unknown-linux-gnu
release: 1.43.1
LLVM version: 9.0

and

rustc 1.45.0-nightly (a74d1862d 2020-05-14)
binary: rustc
commit-hash: a74d1862d4d87a56244958416fd05976c58ca1a8
commit-date: 2020-05-14
host: x86_64-unknown-linux-gnu
release: 1.45.0-nightly
LLVM version: 9.0

The text was updated successfully, but these errors were encountered:

hanna-kruppe · 2020-05-20T13:46:12Z

For reference, can you post the machine code this results in? I'm particularly curious if there's any calls to functions already built in the distributed libstd, since those would potentially suffer from ABI mismatches (see #63597) that could explain the weird behavior.

eduardosm · 2020-05-20T14:08:15Z

For reference, can you post the machine code this results in? I'm particularly curious if there's any calls to functions already built in the distributed libstd, since those would potentially suffer from ABI mismatches (see #63597) that could explain the weird behavior.

Sure, the assembly of the main function is (after removing the println!s to simplify it):

_ZN10float_test4main17h3d6b0b215bf83903E:
	.cfi_startproc
	pushl	%ebx
	.cfi_def_cfa_offset 8
	subl	$24, %esp
	.cfi_def_cfa_offset 32
	.cfi_offset %ebx, -8
	calll	.L9$pb
	.cfi_adjust_cfa_offset 4
.L9$pb:
	popl	%ebx
	.cfi_adjust_cfa_offset -4
.Ltmp6:
	addl	$_GLOBAL_OFFSET_TABLE_+(.Ltmp6-.L9$pb), %ebx
	calll	_ZN10float_test7get_num17hedf1e4650ddf3052E
	fstpl	16(%esp)
	calll	_ZN10float_test7get_num17hedf1e4650ddf3052E
	fldl	16(%esp)
	fmulp	%st, %st(1)
	fucom	%st(0)
	fnstsw	%ax
	sahf
	jp	.LBB9_5
	flds	.LCPI9_0@GOTOFF(%ebx)
	fucomp	%st(1)
	fnstsw	%ax
	sahf
	jae	.LBB9_5
	flds	.LCPI9_1@GOTOFF(%ebx)
	fxch	%st(1)
	fucom	%st(1)
	fstp	%st(1)
	fnstsw	%ax
	sahf
	jae	.LBB9_5
	fstpl	8(%esp)
	movl	12(%esp), %eax
	notl	%eax
	testl	$2146435072, %eax
	fldz
	je	.LBB9_4
.LBB9_5:
	fstp	%st(0)
	addl	$24, %esp
	.cfi_def_cfa_offset 8
	popl	%ebx
	.cfi_def_cfa_offset 4
	retl
.LBB9_4:
	.cfi_def_cfa_offset 32
	fstp	%st(0)
	calll	_ZN3std9panicking11begin_panic17h261cc8b487132e56E
	ud2

...

.LCPI9_0:
	.long	4286578688
.LCPI9_1:
	.long	2139095040

Generated with:

RUSTFLAGS="-C target-cpu=pentium" cargo rustc --release --target i686-unknown-linux-gnu -- --emit asm

I think I forgot to mention in the OP that this happens in both debug and release mode.

RalfJung · 2020-05-20T19:33:41Z

Uh-oh... should this be labelled "unsound"? Any time codegen'd behavior does not match the spec, that can be considered a soundness issue as well.

eduardosm · 2020-05-21T19:27:33Z

I just found out that this can be reproduced without RUSTFLAGS="-C target-cpu=pentium" using --target i586-unknown-linux-gnu .

ecstatic-morse · 2020-06-13T20:20:37Z

I was searching for open floating point issues related to #73328 and stumbled upon this one, which seems to have slipped through the cracks. Marking as "unsound" and "needs prioritization". I'm still not sure exactly what causes this.

cc @rust-lang/wg-prioritization

ranma42 · 2020-06-13T20:29:35Z

I believe this happens because x87 internally works on 80-bits numbers.
This format can handle numbers up to about 10^4932, which means that the result of the multiplication is not infinity (when it is on the FP stack), hence the comparisons do not succeed.
On the other hand, the code that extracts the exponent casts it to 64-bits, so the number is rounded (truncated?) to the positive infinity.

ranma42 · 2020-06-13T20:36:10Z

In a way, this is what is happening behind the scenes:

#[inline(never)]
fn get_num() -> f64 {
    let num: f64 = 1.0e30;
    // volatile is to avoid optimizations
    unsafe { std::ptr::read_volatile(&num) }
}

fn main() {
    let x = get_num();
    let y = get_num();
    let z = x * y;
    if z != f64::INFINITY && z != f64::NEG_INFINITY && !z.is_nan() {
        let exp = ((z as f32).to_bits() >> 23) & 0x7F;
        assert!(exp != 0x7F);
        println!("is finite, exp = {}", exp);
    } else {
        println!("is not finite");
    }
}

Note that in this case the downcast as f32 is explicit.

ecstatic-morse · 2020-06-13T21:06:31Z

Indeed. The underlying cause is clear. I wonder what we should do here, though? Does Rust currently guarantee that extended precision is not used for operations on f64? If so, this is technically a miscompilation. However, I don't know whether it's worth fixing. Maybe we should just document the status quo and move on?

RalfJung · 2020-06-14T07:32:33Z

ecstatic-morse added I-prioritize I-unsound boom labels

Shouldn't we rather tag #73328 instead of tagging the 4 issues that are all instances of the same problem?

ecstatic-morse · 2020-06-14T13:18:33Z

This is wholly distinct from #73328. There's no NaN anywhere in this program.

RalfJung · 2020-06-14T13:32:53Z

Oh, good point. I had missed that.

So this is truly an i686-only artifact caused by using the x87 instructions, and has nothing to do with LLVM's sloppiness around NaNs?

RalfJung · 2020-06-14T13:35:03Z

Could we argue that this is a bug in LLVM, or is there some statement in LLVM's LangRef that actually makes this a correct compilation of the LLVM IR rustc produces?

ecstatic-morse · 2020-06-14T18:55:21Z

Not sure. The LLVM reference says:

The binary format of half, float, double, and fp128 correspond to the IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128 respectively.

I don't have a copy of the IEEE spec, so I don't know if sharing a "binary format" implies that they must not use a higher precision for intermediate results. I think that the IEEE-754 may explicitly allow for this, however? In any case, it's unlikely that LLVM will guarantee the same precision everywhere as long as x87 instructions are supported. C has had the same issue for 20 years now. The GCC wiki discusses what would be required to make floating point math for double predictably 64-bit on x87. Presumably this kind of approach was also considered by LLVM and rejected.

As long as 32-bit x86 is a tier 1 target, I believe we won't be able to guarantee at the language level that floating point math is always performed at 64-bit precision for f64 (same goes for f32). Instead, we will have to say something like "the precision of intermediate floating point calculations is platform-defined but rustc guarantees that all tier 1 platforms besides i686 use the same precision as the floating-point type for all computations".

RalfJung · 2020-06-14T19:07:19Z

As long as 32-bit x86 is a tier 1 target, I believe we won't be able to guarantee at the language level that floating point math is always performed at 64-bit precision for f64 (same goes for f32).

We could require SSE. (No idea if that is even remotely realistic, but it felt worth mentioning.)

ecstatic-morse · 2020-06-14T19:15:04Z

The original SSE only supported single-precision floating point arithmetic. You need at least SSE2 for the semantics we want. I believe that all i686 processors have SSE but not all have SSE2. Not sure what we do by default here?

But yes, I think that would be fine.

hanna-kruppe · 2020-06-14T19:23:18Z

FWIW, the target triples i686-* are a misnomer, they include SSE2 already.

comex · 2020-06-15T00:07:20Z

See also: rust-lang/rfcs#2686 (“Allow floating-point operations to provide extra precision than specified, as an optimization”)

RalfJung · 2020-06-15T07:42:49Z

@hanna-kruppe Ah, that's why the OP had to do RUSTFLAGS="-C target-cpu=pentium".
Can we declare non-SSE2-i686 "unsupported" to "fix" the problem here? I am not sure what that would actually mean in practice, maybe error or at least warn when SSE2 is disabled, or so. Without proper hardware support, it seems hard to do anything better, and if it's a warning people can still proceed with caution.

@comex there were many concerns in that RFC about enabling such optimizations by default (rust-lang/rfcs#2686 (comment), rust-lang/rfcs#2686 (comment)). Also, would that RFC really lead to problems such as this? f64 is still a 64-bit type, so the assertion here should still hold, right? The issue arises because f64 is actually represented as 80bits on x87 and thus there are more possible bit patterns than there should be?

RalfJung · 2020-08-07T10:55:53Z

Is this related to or a duplicate of #73288? Both are about x87 floating point weirdness, from what I can tell.

workingjubilee · 2021-04-29T19:45:47Z

Related to (in the sense of "it's about x86 and the x87 FPU"), but not a duplicate of, #73288.

Muon · 2022-12-30T07:04:47Z

I don't have a copy of the IEEE spec, so I don't know if sharing a "binary format" implies that they must not use a higher precision for intermediate results. I think that the IEEE-754 may explicitly allow for this, however?

I do! IEEE 754 doesn't directly concern itself with expression evaluation or intermediates. It only mandates how individual arithmetical operations are to be performed. The existence of the x87's 80-bit format is not even mentioned in the spec. Section 11 (Reproducible floating-point results) of IEEE 754-2019 concerns itself with reproducible programming, but that's an opt-in. Section 11 stipulates that the reproducible operations are those defined in Section 5 (Operations). The definition of arithmetic operations in Section 5 requires that they are correctly rounded for their destination format. That means that if you are claiming to be producing a 64-bit floating-point product according to the spec's definition of multiplication, you must use that for further operations, and not use your higher intermediate precision to perform further operations on it.

It's possible to cajole the x87 unit into compliance with IEEE 754, but it can be painful. You need to:

Set the precision flag to match your target precision (example: https://godbolt.org/z/ETKx7rMPb).
Make sure the exponents of intermediates stay within your destination's exponent range. In the worst case, this can be accomplished by storing and reloading intermediates before they are used again.
Ensure signaling NaNs are in the 80-bit format before loading them into the x87 registers. As long as you only load/store 80-bit floats onto the x87 stack, all bits are preserved (example: https://godbolt.org/z/qoafPxPPa). Luckily, this usually doesn't matter anyway since signalling NaNs must be quieted if you actually do anything with them. Mostly, just don't use the x87 registers for temporary storage of floats that might be signalling NaNs.

Unfortunately, the best way to use the x87 is to stick to its native 80-bit format, which is not portable.

RalfJung · 2023-06-28T11:22:27Z

I propose that we close this issue by documenting this as a known non-compliance on x86-32 targets: #113053.

Muon · 2023-06-28T12:22:56Z

I'm not sure this should be closed. This is an LLVM codegen bug, the same bug as in llvm/llvm-project#44218.

RalfJung · 2023-06-28T12:42:51Z

Ah, that's a fun one -- is that exploiting the fact that LLVM optimizes as if floats work according to IEEE 754, but then at runtime they behave differently because x87?

rustc's own MIR optimizations might well do something similar. Fundamentally this is still a target bug, the optimization is just correctly doing source-level reasoning -- but the backend fails to correctly implement the source behavior.

Muon · 2023-06-28T14:48:19Z

What's happening in the linked issue is that the sum is initially computed with 80-bit precision, but the comparison is evaluated twice with different precisions. The value of z passed to printf is evaluated at 80-bit precision, and then stored on the stack to be passed as a vararg. Calling convention means the x87 registers are not preserved, so the sum that went into the comparison gets spilled to a 64 bit floating point temporary. After the printf, the comparison then gets repeated using the spilled (now rounded) value, because LLVM erroneously believes that this is the same value as was in the floating point register.

RalfJung · 2023-08-04T18:53:25Z

Closing in favor of a dedicated tracking issue that summarizes what we know about the issue: #114479.

…bilee add notes about non-compliant FP behavior on 32bit x86 targets Based on ton of prior discussion (see all the issues linked from rust-lang/unsafe-code-guidelines#237), the consensus seems to be that these targets are simply cursed and we cannot implement the desired semantics for them. I hope I properly understood what exactly the extent of the curse is here, let's make sure people with more in-depth FP knowledge take a close look! In particular for the tier 3 targets I have no clue which target is affected by which particular variant of the x86_32 FP curse. I assumed that `i686` meant SSE is used so the "floating point return value" is the only problem, while everything lower (`i586`, `i386`) meant x87 is used. I opened rust-lang#114479 to concisely describe and track the issue. Cc `@workingjubilee` `@thomcc` `@chorman0773` `@rust-lang/opsem` Fixes rust-lang#73288 Fixes rust-lang#72327

Rollup merge of rust-lang#113053 - RalfJung:x86_32-float, r=workingjubilee add notes about non-compliant FP behavior on 32bit x86 targets Based on ton of prior discussion (see all the issues linked from rust-lang/unsafe-code-guidelines#237), the consensus seems to be that these targets are simply cursed and we cannot implement the desired semantics for them. I hope I properly understood what exactly the extent of the curse is here, let's make sure people with more in-depth FP knowledge take a close look! In particular for the tier 3 targets I have no clue which target is affected by which particular variant of the x86_32 FP curse. I assumed that `i686` meant SSE is used so the "floating point return value" is the only problem, while everything lower (`i586`, `i386`) meant x87 is used. I opened rust-lang#114479 to concisely describe and track the issue. Cc `@workingjubilee` `@thomcc` `@chorman0773` `@rust-lang/opsem` Fixes rust-lang#73288 Fixes rust-lang#72327

eduardosm added the C-bug Category: This is a bug. label May 18, 2020

jonas-schievink added A-codegen Area: Code generation O-x86 T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 18, 2020

ecstatic-morse added I-prioritize Issue: Indicates that prioritization has been requested for this issue. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness labels Jun 13, 2020

ecstatic-morse mentioned this issue Jun 14, 2020

Document guarantees (or lack thereof) regarding sign, quietness, and payload of NaNs #73328

Open

RalfJung mentioned this issue Jun 14, 2020

Our floating point semantics were a mess rust-lang/unsafe-code-guidelines#237

Open

RalfJung added the A-floating-point Area: Floating point numbers and arithmetic label Jun 16, 2020

spastorino added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jun 17, 2020

RalfJung mentioned this issue Sep 9, 2020

Tracking Issue for total_cmp (on f32/f64) #72599

Closed

6 tasks

gburgessiv mentioned this issue Feb 25, 2023

Add known-bug tests for all(?) I-unsound issues #105107

Open

workingjubilee added the A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. label Mar 3, 2023

RalfJung mentioned this issue Jun 28, 2023

add notes about non-compliant FP behavior on 32bit x86 targets #113053

Merged

RalfJung mentioned this issue Aug 4, 2023

Tracking issue: 32bit x86 targets without SSE2 have unsound floating point behavior #114479

Open

RalfJung closed this as completed Aug 4, 2023

Nilstrieb added O-x86_32 Target: x86 processors, 32 bit (like i686-*) and removed O-x86-all labels Oct 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comparing to infinity is buggy on x87 #72327

Comparing to infinity is buggy on x87 #72327

eduardosm commented May 18, 2020

hanna-kruppe commented May 20, 2020

eduardosm commented May 20, 2020 •

edited

RalfJung commented May 20, 2020

eduardosm commented May 21, 2020

ecstatic-morse commented Jun 13, 2020 •

edited

ranma42 commented Jun 13, 2020

ranma42 commented Jun 13, 2020 •

edited

ecstatic-morse commented Jun 13, 2020

RalfJung commented Jun 14, 2020

ecstatic-morse commented Jun 14, 2020

RalfJung commented Jun 14, 2020

RalfJung commented Jun 14, 2020

ecstatic-morse commented Jun 14, 2020 •

edited

RalfJung commented Jun 14, 2020

ecstatic-morse commented Jun 14, 2020 •

edited

hanna-kruppe commented Jun 14, 2020

comex commented Jun 15, 2020 •

edited

RalfJung commented Jun 15, 2020

RalfJung commented Aug 7, 2020 •

edited

workingjubilee commented Apr 29, 2021 •

edited

Muon commented Dec 30, 2022 •

edited

RalfJung commented Jun 28, 2023

Muon commented Jun 28, 2023

RalfJung commented Jun 28, 2023

Muon commented Jun 28, 2023 •

edited

RalfJung commented Aug 4, 2023

Comparing to infinity is buggy on x87 #72327

Comparing to infinity is buggy on x87 #72327

Comments

eduardosm commented May 18, 2020

Meta

hanna-kruppe commented May 20, 2020

eduardosm commented May 20, 2020 • edited

RalfJung commented May 20, 2020

eduardosm commented May 21, 2020

ecstatic-morse commented Jun 13, 2020 • edited

ranma42 commented Jun 13, 2020

ranma42 commented Jun 13, 2020 • edited

ecstatic-morse commented Jun 13, 2020

RalfJung commented Jun 14, 2020

ecstatic-morse commented Jun 14, 2020

RalfJung commented Jun 14, 2020

RalfJung commented Jun 14, 2020

ecstatic-morse commented Jun 14, 2020 • edited

RalfJung commented Jun 14, 2020

ecstatic-morse commented Jun 14, 2020 • edited

hanna-kruppe commented Jun 14, 2020

comex commented Jun 15, 2020 • edited

RalfJung commented Jun 15, 2020

RalfJung commented Aug 7, 2020 • edited

workingjubilee commented Apr 29, 2021 • edited

Muon commented Dec 30, 2022 • edited

RalfJung commented Jun 28, 2023

Muon commented Jun 28, 2023

RalfJung commented Jun 28, 2023

Muon commented Jun 28, 2023 • edited

RalfJung commented Aug 4, 2023

eduardosm commented May 20, 2020 •

edited

ecstatic-morse commented Jun 13, 2020 •

edited

ranma42 commented Jun 13, 2020 •

edited

ecstatic-morse commented Jun 14, 2020 •

edited

ecstatic-morse commented Jun 14, 2020 •

edited

comex commented Jun 15, 2020 •

edited

RalfJung commented Aug 7, 2020 •

edited

workingjubilee commented Apr 29, 2021 •

edited

Muon commented Dec 30, 2022 •

edited

Muon commented Jun 28, 2023 •

edited