Floating point fill values' endianness #279

clbarnes · 2023-11-01T15:17:27Z

Following on from #236

IEEE754 doesn't specify an endianness for float representations - does this mean that the hex string representation of the fill value of a float dataset is dependent on the endianness of the codecs? If so, it would be much more convenient to just say that it's always of a particular endianness.

jbms · 2023-11-01T15:22:04Z

No, the hex string always has the sign bit as the most significant bit (i.e. first) and does not depend on endianness. Perhaps you can create a PR to clarify.
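As a minimal sketch of this point (the helper names below are illustrative and do not come from the spec), the following Python packs the same float32 value in both byte orders and checks that the recovered bit pattern, and therefore the hex string, is identical:

```python
import struct


def float32_to_hex(value: float) -> str:
    """Hex string of the IEEE 754 binary32 bit pattern, sign bit first."""
    # Pack big-endian so the first byte holds the sign bit, then read the
    # four bytes back as a single unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return f"0x{bits:08x}"


def float32_bits_from_bytes(raw: bytes, byteorder: str) -> int:
    """Recover the bit pattern from stored bytes of a known byte order."""
    return int.from_bytes(raw, byteorder)


value = 1.5
little = struct.pack("<f", value)  # bytes as stored in little-endian order
big = struct.pack(">f", value)     # bytes as stored in big-endian order

# The stored bytes differ, but the integer bit pattern does not.
assert little != big
assert float32_bits_from_bytes(little, "little") == float32_bits_from_bytes(big, "big")

print(float32_to_hex(value))  # 0x3fc00000
```

Endianness only affects how the bytes are laid out in storage; the hex string describes the number's bits read as one unsigned integer.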

clbarnes · 2023-11-01T16:13:50Z

Is that an implementation detail of the C function referenced in the spec?

jbms · 2023-11-01T16:38:36Z

> Is that an implementation detail of the C function referenced in the spec?

No, and actually the warning about strtod was in relation to the NaN syntax nan(1234), which I previously proposed but which was rejected.

strtod accepts the "0xYYYYYYYY[.ZZZZZZ]" hex floating point syntax, which has a different meaning. Unfortunately, strtod does not guarantee that every distinct NaN value has a corresponding string representation, so we can't rely on the strtod spec.

I intended to convey what I said in #279 (comment) with the language "specifying the byte representation of the floating point number as an unsigned integer", where I was assuming the usual endian-agnostic representation of the floating point number as a sequence of bits, where the first (most significant) bit is the sign bit, followed by the exponent bits, followed by the mantissa bits. The NaN example also serves to clarify. Perhaps there is a better way to state it, though.
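To illustrate the two readings of the same text, here is a small Python sketch. It uses `float.fromhex` as a stand-in for the strtod-style hexadecimal floating point syntax (an assumption made for illustration; the thread itself only discusses C's strtod) and `struct` for the bit-pattern reading:

```python
import struct

text = "0x7fc00000"

# Read as a hexadecimal floating point literal (the strtod-style meaning):
# the digits are the numeric value itself.
as_value = float.fromhex(text)

# Read as a bit pattern (the meaning intended for fill values):
# the digits are the 32 bits of the float, sign bit first.
(as_bits,) = struct.unpack(">f", int(text, 16).to_bytes(4, "big"))

print(as_value)  # 2143289344.0
print(as_bits)   # nan
```

The first reading yields an ordinary large number, while the second yields a NaN, which is why the two syntaxes cannot be treated as interchangeable.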

clbarnes · 2023-11-01T17:58:28Z

> the usual endian-agnostic representation of the floating point number

This norm is what I was struggling to find details of; my searches only turned up ambiguity, e.g. https://stackoverflow.com/questions/2945174/floating-point-endianness

clbarnes · 2023-11-01T19:31:07Z

Writing the PR using this language

> where the first (most significant) bit is the sign bit, followed by the exponent bits, followed by the mantissa bits

and had another question - different languages may default to different NaN values when using their respective NaN-creation routines. Are we taking a "NaN" fill to mean that any NaN value is valid, or are we specifying a specific NaN as implied by the example in the "0x..." point? If the former, implementations probably shouldn't ever write "NaN" (opting for the byte string instead) because they don't necessarily know the intention of other readers/writers. The alternative is to disallow specific NaNs entirely.

jbms · 2023-11-02T03:53:05Z

> Writing the PR using this language
>
> > where the first (most significant) bit is the sign bit, followed by the exponent bits, followed by the mantissa bits
>
> and had another question - different languages may default to different NaN values when using their respective NaN-creation routines. Are we taking a "NaN" fill to mean that any NaN value is valid, or are we specifying a specific NaN as implied by the example in the "0x..." point? If the former, implementations probably shouldn't ever write "NaN" (opting for the byte string instead) because they don't necessarily know the intention of other readers/writers. The alternative is to disallow specific NaNs entirely.

"NaN" means the specific value as defined in the specification:

"NaN", denoting thenot-a-number (NaN) value where the sign bit is 0 (positive), the most significant bit (MSB) of the mantissa is 1, and all other bits of the mantissa are zero.

(There is a missed space.)

jbms · 2023-11-02T03:55:29Z

Note that an IEEE 754 NaN value is indicated by any sign bit, all 1 exponent bits, and any non-zero mantissa. By specifying the sign and mantissa we fully specify the value.
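The rule above can be sketched for float32 (1 sign bit, 8 exponent bits, 23 mantissa bits) as follows. The function and constant names are hypothetical and chosen only for this example:

```python
def split_float32(bits: int) -> tuple[int, int, int]:
    """Split a 32-bit pattern into (sign, exponent, mantissa) fields."""
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return sign, exponent, mantissa


def is_nan_float32(bits: int) -> bool:
    """NaN: any sign bit, all-ones exponent, non-zero mantissa."""
    _, exponent, mantissa = split_float32(bits)
    return exponent == 0xFF and mantissa != 0


# The specific value described for "NaN": sign bit 0, most significant
# mantissa bit 1, all other mantissa bits 0.
SPEC_NAN_FLOAT32 = (0 << 31) | (0xFF << 23) | (1 << 22)

assert SPEC_NAN_FLOAT32 == 0x7FC00000
assert is_nan_float32(0x7FC00000)      # the value named "NaN"
assert is_nan_float32(0xFFC00000)      # also a NaN, but with the sign bit set
assert is_nan_float32(0x7F800001)      # also a NaN, with a different mantissa
assert not is_nan_float32(0x7F800000)  # +Infinity: the mantissa is zero
```

Many bit patterns are NaNs, but only one of them matches the sign and mantissa given for "NaN".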

LDeakin added a commit to LDeakin/zarrs that referenced this issue Nov 4, 2023

Fix NaN handling

f23c8ab

This accounts for the fact that f32/f64::NAN is not guaranteed to match the byte representation of a NaN as specified in the zarr spec. zarr-developers/zarr-specs#279
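The referenced commit is in Rust; the same idea can be sketched in Python under stated assumptions. `encode_nan_fill_value` is a hypothetical helper, not part of any Zarr library, and the sketch covers only float64 NaNs: it compares the actual bit pattern against the value described for "NaN" instead of trusting the language's default NaN.

```python
import struct

# float64 NaN with sign bit 0, most significant mantissa bit 1, rest 0.
SPEC_NAN_FLOAT64 = 0x7FF8000000000000


def float64_bits(value: float) -> int:
    """Bit pattern of a float64 as an unsigned integer, sign bit first."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    return bits


def encode_nan_fill_value(bits: int) -> str:
    """Hypothetical helper: pick the string for a float64 NaN fill value."""
    if bits == SPEC_NAN_FLOAT64:
        return "NaN"
    # Any other NaN keeps its exact bits by using the hex form.
    return f"0x{bits:016x}"


# The bits of the runtime's default NaN are not guaranteed to match the
# spec value, so check them rather than assuming.
runtime_nan_bits = float64_bits(float("nan"))
print(encode_nan_fill_value(runtime_nan_bits))

print(encode_nan_fill_value(0xFFF8000000000000))  # 0xfff8000000000000
```

Writing the hex form for any NaN that differs from the spec value means a reader gets back exactly the bits the writer had.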
