32-bit Windows cannot handle Rfloat::na() #321

yutannihilation · 2021-11-10T23:58:57Z

failures:

---- src\scalar\rfloat.rs - scalar::rfloat::Rfloat (line 35) stdout ----
Test executable failed (exit code 101).

stderr:
thread 'main' panicked at 'assertion failed: (<Rfloat>::na()).is_na()', src\scalar\rfloat.rs:6:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


---- src\wrapper\doubles.rs - wrapper::doubles::Doubles::elt (line 25) stdout ----
Test executable failed (exit code 101).

stderr:
thread 'main' panicked at 'assertion failed: vec.elt(10).is_na()', src\wrapper\doubles.rs:9:4
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

For this reason, one ALTREP-related test is disabled. Let's not forget to re-enable this as well.

473ac05


Ilia-Kosenkov · 2021-11-11T08:39:26Z

I will try to take a look at this.

andy-thomason · 2021-11-11T13:15:50Z

It is possible that it is converting to the legacy FPU type (f80) which would collapse all NaN values into one. NA_REAL is a special NaN which we test explicitly by looking at the bit pattern. Andy.


Ilia-Kosenkov · 2021-11-11T18:06:46Z

So the issue is rather weird. Here is what I found:
To test for NA, we use a libR_sys constant libR_sys::R_NaReal, which is then compared to f64 bitwise (because it is a NaN, which is not comparable to any other NaN or itself).
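To illustrate why the comparison has to be bitwise, here is a minimal sketch in plain Rust with no R dependency. `same_bits` is a hypothetical helper, not something from extendr or libR-sys.

```rust
/// Bitwise equality of two doubles. Unlike `==`, this can match NaNs.
/// Hypothetical helper for illustration only.
fn same_bits(a: f64, b: f64) -> bool {
    a.to_bits() == b.to_bits()
}

fn main() {
    let nan = f64::NAN;
    // IEEE 754: a NaN compares unequal to everything, including itself.
    println!("nan == nan: {}", nan == nan);
    // Comparing the raw bit patterns sidesteps that rule.
    println!("same_bits(nan, nan): {}", same_bits(nan, nan));
    // Bitwise comparison is stricter than `==` for ordinary values:
    // 0.0 == -0.0 numerically, but the sign bit differs.
    println!("same_bits(0.0, -0.0): {}", same_bits(0.0, -0.0));
}
```

A bitwise check like this is only as reliable as the bits themselves, so it assumes the value is not altered on its way into the comparison.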

So in extendr tests I retrieved Rfloat::na().0 and libR_sys::R_NaReal and printed both as u64 using .to_bits().

x64

9218868437227407266  # libR_sys::R_NaReal
9218868437227407266  # Rfloat::na()

x86

9221120237041092514 # libR_sys::R_NaReal
9218868437227407266 # Rfloat::na()

To me it seems that somehow, when compiling for i686, extendr-api references 64-bit bindings, yet the linked bindings are 32-bit. Hence the libR_sys value changes while Rfloat::na() does not.

Ilia-Kosenkov · 2021-11-11T18:31:00Z

x64

 & "$env:R_HOME\bin\x64\Rscript.exe" -e "rextendr::rust_eval('rprintln!(\`"{:?}\`", unsafe { libR_sys::R_NaReal.to_bits()});', dependencies = list(``libR-sys`` = '0.2.2'), extendr_deps = list(``extendr-api`` = list(git = 'https://github.com/extendr/extendr', branch = 'master')))"

9218868437227407266

x86

 & "$env:R_HOME\bin\i386\Rscript.exe" -e "rextendr::rust_eval('rprintln!(\`"{:?}\`", unsafe { libR_sys::R_NaReal.to_bits()});', dependencies = list(``libR-sys`` = '0.2.2'), extendr_deps = list(``extendr-api`` = list(git = 'https://github.com/extendr/extendr', branch = 'master')))"

9218868437227407266

Extremely strange.

Ilia-Kosenkov · 2021-11-11T18:39:02Z

fn main() {
    extendr_engine::start_r();
    println!("{:?}", unsafe { libR_sys::R_NaReal.to_bits() });
    extendr_engine::end_r();
}

This thing works correctly for both architectures and yields 9218868437227407266.

Ilia-Kosenkov · 2021-11-11T19:07:04Z

{cpp11} correctly retrieves the value of NA for both architectures:

writable::integers get_na() {
    auto na = R_NaReal;
    auto na_as_ulong = *((unsigned long long*)&na);
    writable::integers out(2);
    out[0] = (int)(na_as_ulong >> 32);  // high DWORD
    out[1] = (int)(na_as_ulong);        // low DWORD
    return out;
}

> get_na()
[1] 2146435072       1954

NA is essentially 0x7ff00000 << 32 | 1954.
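As a quick cross-check in Rust, the two integers printed by get_na() can be joined back into the 64-bit pattern. This is only a sketch; `join_words` and `split_words` are hypothetical helpers introduced for illustration.

```rust
/// Join two 32-bit words into one 64-bit pattern (hypothetical helper).
fn join_words(high: u32, low: u32) -> u64 {
    (u64::from(high) << 32) | u64::from(low)
}

/// Split a 64-bit pattern into (high, low) 32-bit words (hypothetical helper).
fn split_words(bits: u64) -> (u32, u32) {
    ((bits >> 32) as u32, bits as u32)
}

fn main() {
    // The two integers returned by get_na() above.
    let bits = join_words(2146435072, 1954);
    println!("{} = {:#018x}", bits, bits);

    let (high, low) = split_words(bits);
    println!("high = {:#010x}, low = {}", high, low);
}
```

The joined value is the same 9218868437227407266 reported by the x64 runs earlier in the thread.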

Ilia-Kosenkov · 2021-11-11T19:29:02Z

Soooooo, it seems there is some internal magic in Rust about NaN handling.
The issue seems to be resolved if #[repr(C)]...

~~And I cannot reproduce this behaviour under different conditions.~~
This turned out to be incorrect

Ilia-Kosenkov · 2021-11-23T10:57:50Z

The difference between the 32- and 64-bit NAs is in the high DWORD.
The low DWORD is 1954u32 in both cases, but the high DWORD differs:

x64: 0x7ff00000u64 << 32 | 1954 = 9218868437227407266
x86: 0x7ff80000u64 << 32 | 1954 = 9221120237041092514

I have absolutely no idea why this happens. The difference is a single bit (0 vs 8 in the fourth hex digit).

UPD: I think I found an explanation but not yet the solution...
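To pin down which bit differs, the two patterns can be XORed. The sketch below works on plain u64 values only, so it does not depend on any float handling; `differing_bits` is a hypothetical helper.

```rust
// Bit patterns reported in this thread.
const NA_EXPECTED: u64 = (0x7ff0_0000u64 << 32) | 1954;
const NA_OBSERVED_X86: u64 = (0x7ff8_0000u64 << 32) | 1954;

/// Positions (0 = least significant) of the bits that differ between
/// two patterns. Hypothetical helper for illustration only.
fn differing_bits(a: u64, b: u64) -> Vec<u32> {
    let diff = a ^ b;
    (0..64u32).filter(|&i| (diff >> i) & 1 == 1).collect()
}

fn main() {
    println!("{:?}", differing_bits(NA_EXPECTED, NA_OBSERVED_X86));
}
```

The only differing position is bit 51, which is the highest bit of the 52-bit mantissa of an f64.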

andy-thomason · 2021-11-23T11:13:49Z

You may have found a rust bug! Ross Ihaka's birthday breaks the build.

https://en.wikipedia.org/wiki/Ross_Ihaka

andy-thomason · 2021-11-23T11:33:58Z

Could you formulate this into a 3 line example without dependencies so that we can submit a bug report?

Ilia-Kosenkov · 2021-11-23T11:58:58Z

@andy-thomason, this is not a bug. Check out the explanation I linked above. In simple terms, when we compile for x86, the FPU/CPU is being 'smart' and sets the NaN quiet bit when a u64 is reinterpreted as f64 and back again. This 'feature' was found by the devs working on the ~~wasm compiler~~ wasmi experimental interpreter -- they hit the same problem when round-tripping i32 to f32 and back.
One of the devs claims that this allows converting floats to integers and back without acquiring an extra quiet NaN bit.

So far it seems that it happens during compilation of extendr-api, when we access the NA literal from libR-sys. I used .to_bits() to obtain the representation, and it adds the 'quiet' NaN bit.

I tried creating a tiny Rust library referencing libR-sys and extendr-engine, and accessing the NA constant after compiling for x86_64 and i686 -- in both cases I got correct values.

Anyway, I will dig deeper when I have time, but at least now I have some understanding of what is going on (I was afraid we were having some sort of memory corruption or non-atomic access to a 64-bit value when running 32-bit).

See also Wiki

Ilia-Kosenkov · 2021-11-23T12:39:12Z

@andy-thomason, you were right -- there is a four-line reproducible example, unbelievable.
I need to point out that we use MSYS2 to compile this, so it may be related to the libraries/compilers provided by mingw32. The same seems to happen if I use the mingw32 toolchain from rtools40.

fn main() {
    let expected_na_val = 0x7ff00000u64 << 32 | 1954;
    assert_eq!(expected_na_val, f64::from_bits(expected_na_val).to_bits());
}

cargo run --target i686-pc-windows-gnu

    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `C:\Users\[redacted]\AppData\Local\Temp\cargo_temp\i686-pc-windows-gnu\debug\na_test.exe`
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `9218868437227407266`,
 right: `9221120237041092514`', src\main.rs:3:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
error: process didn't exit successfully: `C:\Users\[redacted]\AppData\Local\Temp\cargo_temp\i686-pc-windows-gnu\debug\na_test.exe` (exit code: 101)

UPD: I also tested using different toolchains and no cross-compilation, same result

| Toolchain | Target |
| --- | --- |
| `stable-i686-pc-windows-gnu` | `i686-pc-windows-gnu` |
| `stable-x86_64-pc-windows-gnu` | `i686-pc-windows-gnu` |
| `nightly-x86_64-pc-windows-msvc` | `i686-pc-windows-gnu` |
| `nightly-x86_64-pc-windows-msvc` | `i686-pc-windows-msvc` |
| `stable-i686-pc-windows-msvc` | `i686-pc-windows-msvc` |

Ilia-Kosenkov · 2021-11-23T13:36:54Z

rust-lang/rust#73328

andy-thomason · 2021-11-23T15:22:00Z

Or

rust-lang/rust#73288

Which seems to be exactly this thing!

As I suspected, there is an 8087 register somewhere in the mix.

If CRAN drops x86 support, this should go away with luck.

Ilia-Kosenkov · 2021-11-23T15:38:38Z

Yeah, it definitely will go away, as well as a ton of other issues, as we will likely no longer cross-compile. Yet for now, I wonder if I can make it work correctly.

Ilia-Kosenkov · 2021-11-24T12:33:29Z

So my survey of the topic revealed the following:
According to Wikipedia,

signalling NaN is represented as

0 11111111111 0000000000000000000000000000000000000000000000000001 ≙ 7FF0 0000 0000 0001 ≙ NaN (sNaN on most processors, such as x86 and ARM)

quiet NaN is represented as

0 11111111111 1000000000000000000000000000000000000000000000000001 ≙ 7FF8 0000 0000 0001 ≙ NaN (qNaN on most processors, such as x86 and ARM)

The payload (mantissa) must be non-zero; in R it is the birth year of the author (1954).
What happens is that R returns a signalling NaN (highest mantissa bit set to 0), but our bit manipulations turn it into a quiet NaN (highest mantissa bit set to 1).
Because NA is a NaN with a unique payload (1954), we can easily distinguish between NA and NaN in Rust, but I am concerned about the returned value -- we need to make sure we return a bitwise-correct NA to R.
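One way to make the NA check insensitive to the quiet bit is to test for "any NaN whose low 32 bits are 1954". The following is a minimal sketch of that idea and not necessarily what extendr ends up doing; all names are hypothetical, and it operates on raw u64 patterns so it stays independent of how floats are moved around. To my knowledge R's own R_IsNA takes a similar low-word approach, but treat that as an assumption.

```rust
const EXPONENT_MASK: u64 = 0x7ff0_0000_0000_0000; // 11 exponent bits
const MANTISSA_MASK: u64 = 0x000f_ffff_ffff_ffff; // 52 mantissa bits
const QUIET_BIT: u64 = 1 << 51; // highest mantissa bit
const NA_LOW_WORD: u32 = 1954; // payload R stores in the low 32 bits

/// True if the pattern encodes any NaN: all exponent bits set, mantissa non-zero.
fn is_nan_bits(bits: u64) -> bool {
    (bits & EXPONENT_MASK) == EXPONENT_MASK && (bits & MANTISSA_MASK) != 0
}

/// True if the pattern is a NaN with the quiet bit set.
fn is_quiet_nan_bits(bits: u64) -> bool {
    is_nan_bits(bits) && (bits & QUIET_BIT) != 0
}

/// True if the pattern is a NaN carrying the 1954 payload; the quiet bit is ignored.
fn is_na_bits(bits: u64) -> bool {
    is_nan_bits(bits) && (bits as u32) == NA_LOW_WORD
}

fn main() {
    let signalling_na = (0x7ff0_0000u64 << 32) | 1954; // value seen on x64
    let quiet_na = (0x7ff8_0000u64 << 32) | 1954; // value seen on x86
    for bits in [signalling_na, quiet_na, f64::NAN.to_bits()] {
        println!(
            "{:#018x}: na = {}, quiet = {}",
            bits,
            is_na_bits(bits),
            is_quiet_nan_bits(bits)
        );
    }
}
```

This only covers detection. Whether the value handed back to R keeps the signalling pattern is a separate question that a check like this does not answer.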

As soon as we finish with TryFrom, we need to update extendrtests and verify various conversions, including generating and processing NA_real_ in Rust and returning it to R to verify its bit representation.

yutannihilation mentioned this issue Nov 11, 2021

Let ALTREPs return Rint and Rfloat #318

Merged

Ilia-Kosenkov added the arch-x86 (x86-related issues), bug (Something isn't working), and os-Windows (Windows-specific problems) labels Nov 11, 2021

Ilia-Kosenkov mentioned this issue Nov 12, 2021

Running doctests on Windows #323

Merged

Ilia-Kosenkov mentioned this issue Nov 24, 2021

Fixing NA_real_ comparison on windows-x86 #328

Merged

Ilia-Kosenkov closed this as completed in #328 Dec 1, 2021
