Persistent storage ? #19

Closed
zilveer opened this issue Apr 28, 2020 · 1 comment
zilveer commented Apr 28, 2020

Hi,
Is there any options for persistent storage ?

Regards

simonhf (Owner) commented May 1, 2020

Hi @zilveer, thanks for the question!

There are a few options, depending upon the performance requirements of your application.

Option 1: Highest performance for mostly-read applications.

The easiest way would be not to use /dev/shm as the root for the memory mapped files used by SharedHashFile, but to point to, e.g., an SSD path instead. However, this comes with some performance problems which may or may not be an issue for your application. The performance issues arise from how the kernel internally syncs changed memory map pages in RAM with the pages in the backing store for the memory mapped file. There is no way -- from the userland application -- to predict when the kernel will sync a changed memory map page back to disk, or to force it to. Although this syncing happens in the 'background', it uses resources and can heavily affect performance. But if your application is very read heavy at run-time, this might not even be an issue for you, because too little syncing will happen -- due to infrequent database updates -- to cause a performance problem.

Option 2: Highest general performance for any read / write pattern.

Another mechanism is the general approach described in [1]: use /dev/shm as usual, but have a special process accessing SHF which iterates through the database, saving as it goes; effectively making the database persistent in the background. In the background saving process, you can use SHF internals to loop through all the memory mapped files, which can be copied from RAM to RAM while write locked; the in-memory clone of each memory mapped file can then be persisted to disk or elsewhere completely in the background without affecting database performance. Let's say you create a clone of the /dev/shm SHF memory mapped files on SSD somewhere. If the box is rebooted, then ideally upon start-up you copy the clone back into /dev/shm and restart SHF to get the database back.

Option 3: SharedHashFile agnostic solution.

In this option, whenever you update a key in the database, you also write the update to a log file. If at a later date you want to restore the database, you read the log file and 'replay' all the key updates. Using this mechanism you'll probably need some extra business logic to consolidate log files over time, so that upon start-up you grab a copy of the last known version of the database and replay only a limited set of log files to bring it up to the present.

[1] #18 (comment)

@simonhf simonhf closed this as completed Dec 10, 2020