optimized (smaller) lookup table for float (binary32 only) #99

jrahlf · 2021-08-29T14:44:53Z

Have you considered optimizing the code size for parsing floats?
The LUT power_of_five_128 has approximately 1400 entries, which are needed for parsing doubles.
I don't know how many entries are required for parsing a float, but I suspect the LUT could be a lot smaller in that case.

If there was a separate LUT for parsing floats, the compiled binary size could be reduced significantly.


lemire · 2021-08-29T14:52:31Z

Pull requests invited!

lemire · 2021-08-30T17:42:09Z

To be clear here, if I understand correctly, @jrahlf wants an implementation that supports only binary32 numbers (float). Squeezing the table is easy: one can simply follow through the paper at https://arxiv.org/abs/2101.11408

Of course, the net result will only support binary32 numbers.

Alexhuszagh · 2021-08-30T17:50:35Z

My mistake then, and I just confirmed that this would have off-by-1 values, which would mess up the logic.

jrahlf · 2021-09-05T16:53:03Z

fast_float/include/fast_float/fast_table.h

Line 34 in 8c4405e

    
           constexpr static int smallest_power_of_five = binary_format<double>::smallest_power_of_ten();

If you change these to float, the table size shrinks from 1302 to 208, i.e. you can save approximately 8kB.
So one could add another table power_of_five_128 for float and then let the templatized code use the correct table.

There is one catch: if you used both double and float, the code size would be greater (worse) than when only providing the double table. Two possible solutions:
a) a compile-time option for users who only want to parse floats, which disables from_chars<double>
b) clever data packing so that only the float part of the table gets compiled into the binary if only from_chars<float> is used. I am assuming here that the float table is a subrange of the double table. Is this correct, @lemire?
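
A minimal sketch of how templatized code could pick the right slice of the table per type. Everything here is hypothetical: `pow10_range`, `table_entries`, `float_offset_in_double_table` and `index_for` are illustrative names, not fast_float API, and the exponent bounds are assumptions chosen so that the entry counts come out to the 1302 and 208 quoted above (two 64-bit words per power of five).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for binary_format<T>. The bounds are assumptions
// picked to reproduce the 1302 and 208 entry counts from this thread.
template <typename T> struct pow10_range;
template <> struct pow10_range<double> {
  static constexpr int smallest = -342;
  static constexpr int largest = 308;
};
template <> struct pow10_range<float> {
  static constexpr int smallest = -65;
  static constexpr int largest = 38;
};

// Number of 64-bit words in the table for T (two words per power).
template <typename T>
constexpr std::size_t table_entries() {
  return 2 * static_cast<std::size_t>(pow10_range<T>::largest -
                                      pow10_range<T>::smallest + 1);
}

// If the float table is a subrange of the double table, this is the
// index in the double table at which the float part would start.
constexpr std::size_t float_offset_in_double_table() {
  return 2 * static_cast<std::size_t>(pow10_range<float>::smallest -
                                      pow10_range<double>::smallest);
}

// Index of the first word for exponent q in the table for T.
template <typename T>
inline std::size_t index_for(int q) {
  assert(q >= pow10_range<T>::smallest && q <= pow10_range<T>::largest);
  return 2 * static_cast<std::size_t>(q - pow10_range<T>::smallest);
}

static_assert(table_entries<double>() == 1302, "double table size");
static_assert(table_entries<float>() == 208, "float table size");
static_assert(float_offset_in_double_table() + table_entries<float>() <=
                  table_entries<double>(),
              "float slice must fit inside the double table");
```

Under these assumptions the float slice would start 554 words into the double table, so option b) amounts to making sure that only that slice is emitted when from_chars<double> is never instantiated.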

lemire · 2021-09-05T20:44:14Z

I am assuming here that the float table is a sub range of the double table.

Yes, it is.

jrahlf · 2021-09-12T13:50:01Z

So I got a proof of concept: #103
I added the files: example_test_float.cpp and example_test_mixed.cpp.
With the HEAD version, the file sizes are as follows (Ubuntu gcc9.3):

34072 Sep 12 14:43 tests/example_test
34072 Sep 12 14:43 tests/example_test_float
42656 Sep 12 14:43 tests/example_test_mixed  <-- not ideal

With the separate float LUT the sizes are:

34072 Sep 12 14:47 tests/example_test
25880 Sep 12 14:47 tests/example_test_float   <-- saves 8kB as expected
42744 Sep 12 14:47 tests/example_test_mixed

There are two notable things:

The extra float LUT only increases the mixed file size by about 100 bytes, which is unexpected (in a good way). I expected an increase of 208 * 8 bytes = 1.6 kB.
The mixed file is about 8 kB larger than the double file. Heavy inlining might not be ideal for the mixed case (regarding code size). E.g. readelf shows that fast_float::parse_long_mantissa has a code size of 4 kB and is instantiated for both float and double.
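
The arithmetic behind these observations can be cross-checked with a few constants. This is only a sketch: the entry counts and file sizes are copied from this thread, the 8 bytes per entry follows the "208 * 8" estimate above, and all identifier names are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Table sizes, using the entry counts quoted in this thread.
constexpr std::size_t entry_bytes = sizeof(std::uint64_t);          // 8
constexpr std::size_t double_table_bytes = 1302 * entry_bytes;      // 10416
constexpr std::size_t float_table_bytes = 208 * entry_bytes;        // 1664
constexpr std::size_t table_saving =
    double_table_bytes - float_table_bytes;                         // 8752

// Measured binary sizes in bytes (HEAD vs. separate float LUT).
constexpr long head_double = 34072;
constexpr long head_float = 34072;
constexpr long head_mixed = 42656;
constexpr long lut_float = 25880;
constexpr long lut_mixed = 42744;

// Observed differences.
constexpr long float_only_saving = head_float - lut_float;          // 8192
constexpr long mixed_growth = lut_mixed - head_mixed;               // 88
constexpr long mixed_vs_double = head_mixed - head_double;          // 8584

static_assert(table_saving == 8752, "expected table saving");
static_assert(float_only_saving == 8192, "measured float-only saving");
static_assert(mixed_growth == 88, "measured growth of the mixed binary");
```

So the measured float-only saving (8192 bytes) is in the same range as the 8752 bytes that the smaller table alone would predict, while the mixed binary grows by 88 bytes rather than the 1664 bytes of a full extra float table.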

I would prefer to make the double LUT a composite of the float LUT and additional data, but reading a composite object as one linear array would violate C++ aliasing rules. :(
However, this might be solvable with std::bit_cast ...

Overall, it might make sense to always use either double or float and not mix the types when parsing numbers.

lemire changed the title ~~optimized (smaller) lookup table for float~~ optimized (smaller) lookup table for float (binary32 only) Aug 30, 2021

jrahlf mentioned this issue Sep 12, 2021

Add optimized (smaller) lookup table for float type parsing #103

Open

tiehuis mentioned this issue May 1, 2022

lookup tables and binary size difference tiehuis/zig-parsefloat#1

Closed
