This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

What is the actual storage consumption for KV pairs in pmemkv-java? #99

Open
ch2994 opened this issue Dec 3, 2020 · 5 comments

Comments


ch2994 commented Dec 3, 2020

Hello,

I have slightly modified MixedExample.java so that it tries to put four 4MB Java ByteBuffers into a 32MB vcmap engine. The code compiles, but to my surprise it throws an out-of-memory exception when it tries to put the 4th ByteBuffer.

It looks like a key-value pair consumes more space than its raw size once it is put into pmemkv-java. Can you confirm whether this is the case? If so, is there a way to calculate how much storage is needed before we store the data? And is there an API to get the available storage from an existing pmemkv-java database?

Thanks

MixedExample.txt


karczex commented Dec 4, 2020

For very small pools (like 32MB) you may see a very large overhead in memory footprint relative to the pool size. It comes from data structure metadata (which may vary between engines), pool metadata, and data fragmentation.
In general, every data structure (in DRAM as well) consumes some additional space for metadata.


karczex commented Dec 9, 2020

We should add a test for storage overhead.

Author

ch2994 commented Dec 13, 2020

Thank you Karczex. I understand that there will be overhead with any KV store, but what we have been observing is that a 32MB pmemkv-java database can only hold 8MB worth of data (sometimes 12MB; I don't know whether that points to another issue).

So is there a way to find out, for a given size of data, how much storage pmemkv-java actually uses?
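To put the figures above in perspective, here is a minimal arithmetic sketch. It assumes "MB" means MiB and ignores key sizes; the class and method names are made up for illustration and are not part of pmemkv-java. It only computes what fraction of the pool the payload occupied, it does not query pmemkv.

```java
import java.util.Locale;

// Back-of-the-envelope utilization for the figures quoted in this thread.
// Hypothetical helper, not part of pmemkv-java.
final class PoolUtilization {
    static final long MIB = 1024L * 1024L;

    /** Fraction of the pool occupied by user payload (0.0 to 1.0). */
    static double utilization(long payloadBytes, long poolBytes) {
        if (poolBytes <= 0 || payloadBytes < 0 || payloadBytes > poolBytes) {
            throw new IllegalArgumentException("expected 0 <= payload <= pool and pool > 0");
        }
        return (double) payloadBytes / poolBytes;
    }

    /** Bytes of the pool that are not accounted for by payload. */
    static long unaccountedBytes(long payloadBytes, long poolBytes) {
        return poolBytes - payloadBytes;
    }

    public static void main(String[] args) {
        long pool = 32 * MIB;
        // The two payload sizes observed before the out-of-memory error.
        for (long stored : new long[] {8 * MIB, 12 * MIB}) {
            System.out.println(String.format(Locale.ROOT,
                    "%d MiB stored: %.1f%% of pool, %d MiB unaccounted",
                    stored / MIB,
                    100.0 * utilization(stored, pool),
                    unaccountedBytes(stored, pool) / MIB));
        }
    }
}
```

With these inputs the payload accounts for 25.0% (8 of 32) or 37.5% (12 of 32) of the pool, so roughly 20 to 24 MiB went to metadata, fragmentation, or space the allocator could not hand out for a single 4 MiB value.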

Thanks

Contributor

igchor commented Dec 15, 2020

Unfortunately, there is no easy way to do this for now. We have some ideas (see pmem/pmemkv#671), but they are not implemented yet.

@lukaszstolarczuk
Member

We've implemented a very simplistic way to check the approximate storage consumption of pmemkv.
If you're still interested, please see this example: https://github.com/pmem/pmemkv/blob/master/examples/pmemkv_fill_cpp/pmemkv_fill.cpp

Once you run it, you can estimate how many records will fit into a database with a specific engine and specific key/value sizes. It's all approximate, but it may give you some answers.
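The same idea of estimating capacity empirically can be sketched in Java as "insert fixed-size records until the store reports it is full, then count". This is not the code of the linked C++ example and it does not call pmemkv-java. `SimpleStore`, `StoreFullException`, `SimulatedStore` and `FillEstimator` are hypothetical names, and the overhead and allocation-unit values are invented purely so the sketch runs on its own. To measure a real database, one would implement the interface on top of the actual engine and translate its out-of-memory error.

```java
import java.nio.ByteBuffer;

/** Hypothetical minimal store abstraction; not part of pmemkv-java. */
interface SimpleStore {
    /** Stores the pair, or throws StoreFullException when out of space. */
    void put(ByteBuffer key, ByteBuffer value) throws StoreFullException;
}

/** Hypothetical "no space left" signal for the abstraction above. */
final class StoreFullException extends Exception {
    private static final long serialVersionUID = 1L;
}

/** Toy stand-in for a real engine: fixed per-record overhead, rounded up to an allocation unit. */
final class SimulatedStore implements SimpleStore {
    private final long capacity;
    private final long perRecordOverhead;
    private final long allocUnit;
    private long used;

    SimulatedStore(long capacity, long perRecordOverhead, long allocUnit) {
        this.capacity = capacity;
        this.perRecordOverhead = perRecordOverhead;
        this.allocUnit = allocUnit;
    }

    @Override
    public void put(ByteBuffer key, ByteBuffer value) throws StoreFullException {
        long raw = key.remaining() + value.remaining() + perRecordOverhead;
        long charged = ((raw + allocUnit - 1) / allocUnit) * allocUnit; // round up
        if (used + charged > capacity) {
            throw new StoreFullException();
        }
        used += charged;
    }
}

final class FillEstimator {
    /** Inserts unique fixed-size records until the store is full or the limit is hit. */
    static long countRecordsThatFit(SimpleStore store, int keySize, int valueSize, long limit) {
        if (keySize < Long.BYTES) {
            throw new IllegalArgumentException("keySize must be at least 8 bytes");
        }
        ByteBuffer value = ByteBuffer.allocate(valueSize);
        long count = 0;
        while (count < limit) {
            ByteBuffer key = ByteBuffer.allocate(keySize);
            key.putLong(0, count); // unique key; absolute put keeps position at 0
            try {
                store.put(key, value.duplicate());
            } catch (StoreFullException e) {
                break;
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // Invented parameters: 32 MiB pool, 64 B overhead per record, 256 KiB allocation unit.
        SimpleStore store = new SimulatedStore(32 * mib, 64, 256 * 1024);
        System.out.println(countRecordsThatFit(store, 8, (int) (4 * mib), 1_000));
    }
}
```

With these made-up parameters each 4 MiB value is charged 4.25 MiB, so only 7 records fit instead of the 8 that raw sizes would suggest. The numbers a real engine produces will differ, which is why measuring by filling is more reliable than calculating.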


4 participants