New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mutable RangeBitmap appender #688
Comments
For the case when the range is 0 - Long.MAX_VALUE, the memory overhead should be around 2Mb which looks big but may be acceptable for most of the cases. |
@richardstartin Do you want to take this one? |
hi @bastrich - there's a couple of things to consider
Despite the complexity, several users have now requested mutability in various issues, so I think it's worth implementing. Point 1 above warrants a separate implementation (leaving the existing code alone means users can't be affected). Point 2 above requires a different in-memory layout using At some point in the future we could consider bringing the two implementations under the same interface (which would probably end up being a breaking change because the best name for that interface might be Do you want to have a go at implementing |
Hi @richardstartin, thank you for the detailed response. I didn't think of the solution you proposed but I will check it, thank you! I can create 2 PRs so we can look at both options and compare them. Are there any expected timelines? |
Hi @richardstartin, sorry for my original post, I researched the source code a bit more and can see that my proposal doesn't work and doesn't make any sense (or at least is not full). Indeed, I have to work with containers and their layout. Will try to research more and implement some working solution. |
Is your feature request related to a problem? Please describe.
I have the case, when I need to update RangeBitmap on the fly, but it is immutable, so I have to rebuilt it from scratch every time.
Describe the solution you'd like
I researched the source code, conducted some experiments a bit, and found that implementing a mutable version of RangeBitmap.Appender and adding the 'deserialize' method for it can be a good solution here. For the mutable version of the Appender we can add a method with signature like
public void replace(index: int, prevValue: long, newValue: long)
. The implementation can be based on the changing values in the existing maskBuffer or mask (depending on the current key and rid values). In order to calculate new mask state in the correct way we should store an array of char with the size equal to the maskBuffer size, let's call it maskTrackingBuffer (and the corresponding variable maskTracking). And this new array should contain number of '1' for the corresponding position in maskBuffer. Then, the 'replace' algorithm can be similar to the following pseudocode:Of course, there will be memory overhead for appender, I estimate it like (maskBuffer size) * 16 bits. But for relatively small ranges like, for example, 0-200_000 it should be almost unnoticeable. And anyway there will be choice between immutable and mutable Appender.
I tried to find the solution with less memory consumption but couldn't figure out something more optimal.
Does it sound ok? Is it ok if I implement this logic and create a pull request?
The text was updated successfully, but these errors were encountered: