Scalability #2
Indeed, I don't have a concrete plan to mitigate this yet. One approach is to delete old entries: entries that have been read by everyone, entries that have been overwritten by a new value, or entries that have simply aged (for example, the read status of very old RSS articles is irrelevant). We would then have to store a datetime instead of the number of bytes read, which makes skipping files less efficient. Maybe there is a way to still use bytes?
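For context on why byte offsets are efficient: a reader can jump straight to the unread tail of a file with a single seek, whereas a datetime-based scheme would have to scan entries until it finds the first new one. A minimal sketch of the byte-offset approach (the file layout and function name here are assumptions for illustration, not DecSync's actual format):

```python
import io

def read_new_entries(f: io.IOBase, bytes_read: int) -> bytes:
    """Skip already-processed data in O(1) using a stored byte offset."""
    f.seek(bytes_read)   # jump straight past everything read before
    return f.read()      # only the new, unread tail remains

buf = io.BytesIO(b"old entries\nnew entry\n")
print(read_new_entries(buf, 12))  # b"new entry\n"
```

With a datetime instead, there is no fixed mapping from "entries newer than T" to a file position, so the reader would have to parse entries from the start (or binary-search, if entries were fixed-size or indexed) to find where to resume.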
The size impact could be reduced by storing only diffs inside new-entries. Updating an entry in stored-entries then amounts to applying a diff to the existing entry. Since all values in the store are most likely text-based, diffs should work well.
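A minimal sketch of what such a text diff could look like, using Python's standard `difflib` (the edit-script format here is an assumption for illustration; a real format would also need to be serialized compactly):

```python
import difflib

def make_diff(old: str, new: str) -> list:
    """Compute a compact edit script turning old into new."""
    ops = []
    sm = difflib.SequenceMatcher(a=old, b=new)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            ops.append(("keep", i2 - i1))          # copy i2-i1 chars of old
        else:
            ops.append(("put", new[j1:j2], i2 - i1))  # emit new text, skip old
    return ops

def apply_diff(old: str, ops: list) -> str:
    """Reconstruct the new text by replaying the edit script against old."""
    out, pos = [], 0
    for op in ops:
        if op[0] == "keep":
            out.append(old[pos:pos + op[1]])
            pos += op[1]
        else:
            _, text, skip = op
            out.append(text)
            pos += skip
    return "".join(out)
```

For mostly-unchanged values, the "keep" runs dominate and the stored diff is much smaller than the full new value, which is exactly the saving proposed above.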
For deleting entries, you would only delete entries at the top of the file and store how many bytes have been deleted. When other apps read the file, they simply subtract this number from their stored read-bytes value and start at the new offset. Obviously, only entries that have already been read by all other apps can be deleted, otherwise information is lost. The only caveat is that this is incompatible with diffs.
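The offset bookkeeping this describes can be sketched in a few lines (the names `bytes_read` and `bytes_deleted` are illustrative; where the deleted-bytes counter would actually be stored is left open):

```python
def physical_offset(bytes_read: int, bytes_deleted: int) -> int:
    """Translate an app's logical read position into a position in the
    truncated file, after bytes_deleted bytes were trimmed from the top."""
    # Entries are only trimmed once every app has read them, so the
    # stored position can never fall inside the deleted region.
    assert bytes_read >= bytes_deleted, "app would lose unread data"
    return bytes_read - bytes_deleted

# An app that had read 500 bytes, after 120 bytes were trimmed,
# resumes at byte 380 of the truncated file.
print(physical_offset(500, 120))  # 380
```

The assertion encodes the constraint stated above: deletion is only safe for entries every app has already read, otherwise some reader's offset would point into data that no longer exists.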
I have thought a bit more about reducing the size of the files. Note that I have introduced versioning for DecSync, so incompatible changes are now possible. A few ideas:
I have now actually released a draft of version 2! The main motivation is that Android is moving to SAF (the Storage Access Framework), which is very slow. The main slowdown comes from having many different files spread over many directories, and this is mostly solved by making the structure much flatter. In addition, the format is simplified using the first 2 points given above, entries are duplicated less, and it is easier to remove old entries.
The 1-byte hash system seems a little too limiting in my opinion. With only 256 possible path hashes, wouldn't there be a high probability of collisions? Say you have only 20 paths; then the probability of at least one collision should already be at around 50%. Have I overlooked something here?
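The 50% estimate checks out: this is the birthday paradox with 256 "days". A quick way to verify it exactly:

```python
def collision_probability(num_paths: int, num_hashes: int = 256) -> float:
    """Birthday-paradox probability that at least two of num_paths paths
    share the same hash, given num_hashes equally likely hash values."""
    p_no_collision = 1.0
    for i in range(num_paths):
        # The (i+1)-th path must avoid the i hashes already taken.
        p_no_collision *= (num_hashes - i) / num_hashes
    return 1.0 - p_no_collision

print(round(collision_probability(20), 3))
```

With 20 paths and 256 buckets the result is just over one half, so the intuition in the comment is right: for a uniformly distributed 1-byte hash, collisions become likely at only a few dozen paths.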
This is a great idea, thanks for sharing. I'm mostly worried about the size of `new-entries`, which, according to my understanding, is always growing, containing the whole history of the database. Is there a plan to introduce some mitigation of this?