Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RoaringBitmap.andNot does not produce optimal containers #512

Open
richardstartin opened this issue Aug 21, 2021 · 0 comments
Open

RoaringBitmap.andNot does not produce optimal containers #512

richardstartin opened this issue Aug 21, 2021 · 0 comments

Comments

@richardstartin
Copy link
Member

richardstartin commented Aug 21, 2021

Performing andNot on a large contiguous bitmap and a sparse bitmap does not produce run optimised bitmaps when it could, which inflates size:

    RoaringBitmap existence = new RoaringBitmap();
    existence.add(0L, 1_000_000L);
    printContainerStats(existence);
    RoaringBitmap sparse = new RoaringBitmap();
    for (int i = 0; i < 8000; ++i) {
      sparse.add(i * 125);
    }
    existence.andNot(sparse);
    printContainerStats(existence);
    existence.runOptimize();
    printContainerStats(existence);

This prints:

containers: 16x RunContainer, serialized size: 230, size:168
containers: 16x BitmapContainer, serialized size: 131208, size:131112
containers: 16x RunContainer, serialized size: 32226, size:32164

However, this leads to a seemingly spurious space/time tradeoff, because run optimisation slows subsequent intersections down, so I expect intersections with run containers could be improved:

@State(Scope.Benchmark)
public class AndNotBenchmark {

  @Param("1000000")
  private long size;

  @Param({"true", "false"})
  boolean runOptimise;

  private RoaringBitmap contiguous;
  private RoaringBitmap sparse;
  private RoaringBitmap combined;
  private RoaringBitmap otherSparse;

  @Setup(Level.Trial)
  public void setup() {
    contiguous = new RoaringBitmap();
    contiguous.add(0L, size);
    sparse = new RoaringBitmap();
    for (int i = 0; i < size; i += 125) {
      sparse.add(i);
    }
    combined = contiguous.clone();
    combined.andNot(sparse);
    otherSparse = new RoaringBitmap();
    for (int i = 1; i < size; i += 125) {
      otherSparse.add(i);
    }
    if (runOptimise) {
      combined.runOptimize();
    }
  }

  @Benchmark
  public RoaringBitmap andSparse() {
    return RoaringBitmap.and(combined, otherSparse);
  }

  @Benchmark
  public RoaringBitmap andContiguous() {
    return RoaringBitmap.and(combined, contiguous);
  }
}
Benchmark                      (runOptimise)   (size)  Mode  Cnt   Score   Error  Units
AndNotBenchmark.andContiguous           true  1000000  avgt    5  57.054 ± 0.030  us/op
AndNotBenchmark.andContiguous          false  1000000  avgt    5  20.811 ± 0.065  us/op
AndNotBenchmark.andSparse               true  1000000  avgt    5  39.442 ± 0.408  us/op
AndNotBenchmark.andSparse              false  1000000  avgt    5  20.641 ± 0.280  us/op
@richardstartin richardstartin added bug and removed bug labels Aug 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant