Implement roaring_bitmap_internal_validate #658

lemire · 2023-09-26T20:09:17Z

When deserializing a bitmap, the result may be invalid, for example because of data corruption. Deserialization can still produce a bitmap without reporting a failure, yet the result may be unusable.

You can avoid such problems by hashing your saved data (e.g., md5sum). But we could also, at some expense, directly validate the deserialized data.
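The hashing approach can be sketched as follows: store a checksum next to the serialized bytes and compare it before deserializing. This is only an illustration. It uses 64-bit FNV-1a instead of MD5 to stay self-contained, and the names `fnv1a64` and `checksum_matches` are hypothetical, not part of any Roaring library. FNV-1a is not cryptographic; it only helps detect accidental corruption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// 64-bit FNV-1a hash of a byte buffer (hypothetical helper, not a Roaring API).
uint64_t fnv1a64(const uint8_t *data, size_t len) {
    uint64_t h = UINT64_C(0xcbf29ce484222325);  // FNV offset basis
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= UINT64_C(0x100000001b3);           // FNV prime
    }
    return h;
}

// Returns true when the buffer still hashes to the checksum saved earlier.
bool checksum_matches(const uint8_t *data, size_t len, uint64_t expected) {
    return fnv1a64(data, len) == expected;
}
```

A checksum of this kind catches corrupted bytes, but it says nothing about data that was already malformed when it was written, which is where validating the deserialized bitmap helps.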

The C version of Roaring has an interesting function that can be called after deserializing a bitmap, to make sure it is proper:

https://github.com/RoaringBitmap/CRoaring/blob/a103d3811702b9389c538881c9974e9a7a7552af/src/roaring.c#L435

     roaring_bitmap_t *t = roaring_bitmap_portable_deserialize_safe(serializedbytes, expectedsize);
     if (t == NULL) { return EXIT_FAILURE; } // deserialization itself failed
     const char *reason = NULL;
     if (!roaring_bitmap_internal_validate(t, &reason)) {
         // reason now describes the check that failed
         roaring_bitmap_free(t);
         return EXIT_FAILURE;
     }

It is not very difficult to implement and could help users who have production data.

xtonik · 2023-11-06T19:32:24Z

I have written the validation following the C version of Roaring bitmaps. Some checks were removed: for example, the capacity cannot be negative because it is derived from an array length, which is never negative.

One TODO is left: is it somehow possible to have a run container with nbrruns equal to zero?

Do you have any suggestions on how to create a broken serialized bitmap for tests?
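One possible way to get broken inputs, offered as a sketch and not as the project's test strategy: serialize a valid bitmap, then flip bits in a copy of the bytes. The helper `copy_with_flipped_bit` below is hypothetical. Not every flipped bit yields an invalid bitmap, so a test would likely need to target specific offsets (for example, inside a container's data) and expect either a deserialization failure or a validation failure.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical test helper: returns a heap-allocated copy of `src` with the
// bit at `bit_index` flipped (bit 0 is the lowest bit of byte 0).
// Returns NULL if the index is out of range or allocation fails.
// The caller owns the result and must free() it.
uint8_t *copy_with_flipped_bit(const uint8_t *src, size_t len, size_t bit_index) {
    if (bit_index / 8 >= len) {
        return NULL;
    }
    uint8_t *out = malloc(len);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, src, len);
    out[bit_index / 8] ^= (uint8_t)(1u << (bit_index % 8));
    return out;
}
```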

lemire · 2023-11-09T10:31:24Z

Important work!

> is it somehow possible to have run container with nbrruns equal to zero

It should not be. We don’t include empty containers.
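As an illustration of what such a check could look like, here is a minimal sketch of run-container validation that rejects a zero run count. The type `run16_t` and the function `runs_are_valid` are hypothetical and simplified. The sketch assumes each run covers the values from `start` to `start + length` inclusive, and it is not the actual CRoaring or Java implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical run representation: covers [start, start + length], inclusive.
typedef struct {
    uint16_t start;
    uint16_t length;
} run16_t;

// Returns true if the runs form a valid, non-empty container.
// On failure, *reason (which must be non-NULL) points to a short description.
bool runs_are_valid(const run16_t *runs, size_t n_runs, const char **reason) {
    if (n_runs == 0) {
        *reason = "zero run count";  // empty containers should not be stored
        return false;
    }
    uint32_t prev_end = 0;  // one past the last value of the previous run
    for (size_t i = 0; i < n_runs; i++) {
        uint32_t start = runs[i].start;
        uint32_t end = start + runs[i].length + 1;  // exclusive end
        if (end > 65536) {
            *reason = "run exceeds the 16-bit value range";
            return false;
        }
        if (i > 0 && start < prev_end) {
            *reason = "runs overlap or are out of order";
            return false;
        }
        if (i > 0 && start == prev_end) {
            *reason = "adjacent runs should have been merged";
            return false;
        }
        prev_end = end;
    }
    return true;
}
```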

lemire added the enhancement and help wanted labels on Sep 26, 2023
