Serialize data more efficiently #199

Open
kjnilsson opened this issue Oct 21, 2020 · 0 comments

The WAL / disk easily becomes the bottleneck for a RA system, especially in constrained cloud environments where disks are limited in both operations per second and MB per second. Hence it may be worthwhile to try to reduce the size of the data written to disk in the WAL and disk segments.

#186 allows the serialisation function to be pluggable. This was mostly done to try out term_to_iovec/1 instead of term_to_binary/1. Experiments showed minimal or no benefit, mostly because of the hard-coded limit inside Erlang/OTP on how many buffers a vectored write can use (a maximum of 64), which results in excessive syscalls.
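For reference, here is a minimal sketch of how the two built-in encoders relate. The module and function names (`iovec_sketch`, `compare/1`) are made up for illustration and are not part of Ra. It only shows that both calls produce the same bytes and reports how many fragments the iovec form is split into. It says nothing about how those fragments are batched into writes.

```erlang
-module(iovec_sketch).
-export([compare/1]).

%% Hypothetical helper: encode Term both ways and report the encoded size
%% and the number of binaries in the iovec.
compare(Term) ->
    Bin = term_to_binary(Term),
    %% term_to_iovec/1 is available from OTP 23 and returns a list of binaries.
    IoVec = erlang:term_to_iovec(Term),
    %% Both encoders produce the same external term format bytes,
    %% so this match is expected to succeed.
    Bin = iolist_to_binary(IoVec),
    #{bytes => byte_size(Bin), fragments => length(IoVec)}.
```

Calling `iovec_sketch:compare(Term)` on representative log entries gives a rough idea of how many buffers a single term contributes to a vectored write.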

So what else can we do? We can try to reduce the "fixed" (ish) overhead of each serialised term. Especially for RabbitMQ there will be multiple occurrences of the same atoms (undefined, basic_message, etc.), each of which gets serialised as string data in the binary representation. If we could provide a serialisation function that used an atom cache to replace any atoms with an integer index (like the distribution layer does) then we may be able to reduce the fixed disk overhead. How much depends on the workload, but it may be enough to have a significant benefit.
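To get a feel for how much this could save on a given workload, here is a rough estimator. It is a sketch under stated assumptions, not a measurement of the real encoders. It assumes each inline atom costs a 2-byte header plus its UTF-8 name (as with SMALL_ATOM_UTF8_EXT), that a cache reference costs 2 bytes (as with ATOM_CACHE_REF in the distribution format), and that each distinct atom is written once into a table with a 1-byte length prefix. The module name and the table layout are hypothetical.

```erlang
-module(atom_overhead_sketch).
-export([estimate/1]).

%% Compare the bytes spent on atoms when every occurrence is written inline
%% with a hypothetical scheme that writes each distinct atom once and
%% refers to it by index afterwards.
estimate(Term) ->
    Atoms = atoms(Term, []),
    Distinct = lists:usort(Atoms),
    %% Assumption: 2 header bytes (tag + length) per inline atom.
    Inline = lists:sum([2 + name_size(A) || A <- Atoms]),
    %% Assumption: 1 length byte per table entry, 2 bytes per reference.
    Cached = lists:sum([1 + name_size(A) || A <- Distinct]) + 2 * length(Atoms),
    #{occurrences => length(Atoms),
      distinct => length(Distinct),
      inline_bytes => Inline,
      cached_bytes => Cached}.

name_size(A) ->
    byte_size(atom_to_binary(A, utf8)).

%% Collect every atom occurrence in tuples, lists and maps.
atoms(A, Acc) when is_atom(A) -> [A | Acc];
atoms(T, Acc) when is_tuple(T) -> atoms(tuple_to_list(T), Acc);
atoms([H | T], Acc) -> atoms(T, atoms(H, Acc));
atoms(M, Acc) when is_map(M) -> atoms(maps:to_list(M), Acc);
atoms(_, Acc) -> Acc.
```

Running `atom_overhead_sketch:estimate/1` over a batch of typical commands, rather than a single term, is closer to what a shared cache would see, since the saving comes from repetition across entries.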

The first task would be to write and validate term_to_binary/1 and binary_to_term/1 in pure Erlang.

When that is done and we have some idea of the performance hit, they can then be extended using an atom cache.
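As a starting point, the encoding half might look something like the sketch below. It covers only a small subset of the external term format (small and 32-bit integers, atoms, small tuples, proper lists and binaries) and crashes with `function_clause` on anything else. The module name `etf_sketch` and its functions are illustrative only. Validation here is a round trip through the built-in binary_to_term/1 rather than a byte-for-byte comparison, since the tags term_to_binary/1 emits for atoms differ between OTP releases.

```erlang
-module(etf_sketch).
-export([encode/1, roundtrip_ok/1]).

%% Encode a term from the supported subset into external term format.
%% 131 is the format version byte.
encode(Term) ->
    iolist_to_binary([131, enc(Term)]).

%% SMALL_INTEGER_EXT (97): one unsigned byte.
enc(I) when is_integer(I), I >= 0, I =< 255 ->
    [97, I];
%% INTEGER_EXT (98): 32-bit signed big-endian.
enc(I) when is_integer(I), I >= -2147483648, I =< 2147483647 ->
    <<98, I:32/signed-big>>;
%% SMALL_ATOM_UTF8_EXT (119) or ATOM_UTF8_EXT (118) for long names.
enc(A) when is_atom(A) ->
    Name = atom_to_binary(A, utf8),
    case byte_size(Name) of
        Len when Len =< 255 -> [119, Len, Name];
        Len -> [<<118, Len:16>>, Name]
    end;
%% SMALL_TUPLE_EXT (104): arity in one byte, then the elements.
enc(T) when is_tuple(T), tuple_size(T) =< 255 ->
    [104, tuple_size(T) | [enc(E) || E <- tuple_to_list(T)]];
%% NIL_EXT (106): the empty list.
enc([]) ->
    [106];
%% LIST_EXT (108): 32-bit length, elements, then the tail (NIL_EXT here).
%% Proper lists only: length/1 raises badarg on an improper list.
enc(L) when is_list(L) ->
    [<<108, (length(L)):32>>, [enc(E) || E <- L], 106];
%% BINARY_EXT (109): 32-bit length, then the bytes.
enc(B) when is_binary(B) ->
    [<<109, (byte_size(B)):32>>, B].

%% Validate against the built-in decoder.
roundtrip_ok(Term) ->
    binary_to_term(encode(Term)) =:= Term.
```

For example, `etf_sketch:roundtrip_ok({basic_message, undefined, [1, 2, 300], <<"payload">>})` checks that the built-in decoder reads back what this encoder wrote. A property-based test over generated terms would be a more thorough way to validate a full implementation.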

See: https://erlang.org/doc/apps/erts/erl_ext_dist.html for reference.
