Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: expose consecutive runs #274

Open
ranfdev opened this issue May 9, 2024 · 3 comments
Open

Feature: expose consecutive runs #274

ranfdev opened this issue May 9, 2024 · 3 comments
Labels
proposal A feature proposal

Comments

@ranfdev
Copy link

ranfdev commented May 9, 2024

roaring.consecutive(bit: 0, range: 13..) could return the number of consecutive zeroes from bit 13.

@Kerollmops Kerollmops added the proposal A feature proposal label May 28, 2024
@Kerollmops
Copy link
Member

Hey @ranfdev 👋

Roaring bitmaps are not necessarily "bitmaps" but are optimized to not be too big for some cases. What you seem to want is to know the number of missing values in a certain range. You can probably use the range_cardinality(&self, range: impl Range) -> u64 function and substract this number from the range size itself.

@ranfdev
Copy link
Author

ranfdev commented May 28, 2024

Thanks, but unfortunately range_cardinality doesn't cover my use case.

That's because range_cardinality returns the count of ones or zeros even if they aren't consecutive.

The function Roaring::consecutive(&self, bit, range) would stop counting after finding the first bitflip.

I know roaring bitmaps may be stored compressed with run length encoding. In fact, consecutive could take advantage of this compression to execute faster.

If the roaring bitmap contains RLE compressed ones as (start, count), and the user asks for consecutive(1, start..) the query can return instantly with value count.

I think this should be implemented inside the library, because as a user I don't have access to the internal representation of the bitmap, so I cannot optimize the query.

@Kerollmops
Copy link
Member

That's because range_cardinality returns the count of ones or zeros even if they aren't consecutive.

Ho! I see. That's indeed not the same.

I think this should be implemented inside the library, because as a user I don't have access to the internal representation of the bitmap, so I cannot optimize the query.

For now you could try implementing this outside of the library by using the iterator. But happy to review a PR if you think performances matter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal A feature proposal
Projects
None yet
Development

No branches or pull requests

2 participants