Try using a byte array in ArrayHitCounter instead of a short array #613

Open
2 tasks
alexklibisz opened this issue Nov 29, 2023 · 0 comments

Background

ArrayHitCounter uses an array of shorts to count hits. It's not a very memory-efficient implementation, as it requires an array entry for every document in the segment. It uses shorts because a short requires half the memory of an int (2 bytes vs. 4), and counts should rarely exceed the max value of a short (32,767).

I think an array of bytes would also work, and would require half the memory of the short array. This could be implemented as a new implementation of the HitCounter interface: rename the current one to ShortArrayHitCounter and add a new one, ByteArrayHitCounter. A byte can represent 256 distinct values, so the largest count it can hold is 255, and only if it's read as unsigned (Java bytes are signed, -128 to 127, so the counter would need to mask with & 0xFF when reading). So if the number of hashes passed to MatchHashesAndScoreQuery is <= 255, it uses the ByteArrayHitCounter, else it uses the ShortArrayHitCounter.
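A rough sketch of the idea, not the actual implementation: the `SimpleHitCounter` interface below is a hypothetical stand-in (the real HitCounter interface likely has different methods), and the class names ending in `Sketch` and the `forNumHashes` helper are made up for illustration. It assumes a document's count can never exceed the number of hashes. Since Java's byte is signed, the sketch reads counts back with `& 0xFF` and switches to the short array above 255 hashes.

```java
// Hypothetical minimal interface for illustration only;
// the real HitCounter interface may look quite different.
interface SimpleHitCounter {
    void increment(int docId);
    int get(int docId);
    int capacity();
}

// One byte per document. Counts are stored in a signed byte but read back
// as unsigned, so values 0..255 are representable. A 256th increment of the
// same document would wrap to 0, hence the selection rule below.
final class ByteArrayHitCounterSketch implements SimpleHitCounter {
    private final byte[] counts;

    ByteArrayHitCounterSketch(int capacity) {
        this.counts = new byte[capacity];
    }

    @Override
    public void increment(int docId) {
        counts[docId]++; // wraps past 127 into negatives; get() undoes this
    }

    @Override
    public int get(int docId) {
        return counts[docId] & 0xFF; // interpret the stored byte as unsigned
    }

    @Override
    public int capacity() {
        return counts.length;
    }
}

// Two bytes per document, standing in for the existing short-based counter.
final class ShortArrayHitCounterSketch implements SimpleHitCounter {
    private final short[] counts;

    ShortArrayHitCounterSketch(int capacity) {
        this.counts = new short[capacity];
    }

    @Override
    public void increment(int docId) {
        counts[docId]++;
    }

    @Override
    public int get(int docId) {
        return counts[docId];
    }

    @Override
    public int capacity() {
        return counts.length;
    }
}

final class HitCounterSketch {
    // Largest count that fits in a byte when read as unsigned.
    static final int MAX_BYTE_COUNT = 255;

    // Hypothetical selection helper: use the byte array only when no
    // document can be hit more than 255 times.
    static SimpleHitCounter forNumHashes(int numDocs, int numHashes) {
        return numHashes <= MAX_BYTE_COUNT
                ? new ByteArrayHitCounterSketch(numDocs)
                : new ShortArrayHitCounterSketch(numDocs);
    }

    public static void main(String[] args) {
        SimpleHitCounter counter = forNumHashes(1000, 200);
        for (int i = 0; i < 200; i++) {
            counter.increment(42);
        }
        System.out.println(counter.getClass().getSimpleName() + " count=" + counter.get(42));
    }
}
```

Whether the halved memory footprint outweighs the extra mask on every read is the kind of thing the benchmark deliverable would have to settle.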

Bard already wrote most of it for me:

[Two attached images, not reproduced here]

Deliverables

  • Implement a ByteArrayHitCounter
  • Benchmark it

Related Issues

#611
