Ensure that written data is correctly written #253

Open
JsBergbau opened this issue Sep 7, 2023 · 2 comments

@JsBergbau
Contributor

JsBergbau commented Sep 7, 2023

Output of rest-server --version

current dev version 07.09.2023

What should rest-server do differently?

There is an option "--no-verify-upload do not verify the integrity of uploaded data. DO NOT enable unless the rest-server runs on a very low-power device"

I think rest-server only calculates the checksum of the uploaded blob and checks whether it matches the filename.

However, restic should have an option to re-read the file from storage to ensure that it was written correctly.

What are you trying to do? What is your use case?

I am using rest-server for my backups and cloning these backups to another HDD with rsync. Running restic check --read-data gave the message "Pack ID does not match, want". Luckily, the file is fine on the system where rest-server is running, so something must have gone wrong with rsync, which is also quite strange.
Anyway, rest-server should have an option to re-read the file from the file system and ensure that it was written correctly.
When implementing this, please keep the Linux file system cache in mind. Dropping the cache is only possible for the whole cache and requires root rights, which we don't want, so I suggest reading the file with syscall.O_DIRECT to ensure that it is really read from disk.

I also suggest using something like POSIX_FADV_DONTNEED when writing the files, as nocache does: https://github.com/Feh/nocache
This improves general performance, since it is very unlikely that the just-written blocks will be read again. And if rest-server does read them, we want it to read them directly from disk.

One question is how much this would degrade performance, because other blocks may have to wait for the verification to finish. Maybe somebody with more experience can give an opinion. Maybe this option would be too slow. Nevertheless, we really should think about POSIX_FADV_DONTNEED.

Did rest-server help you today? Did it make you happy in any way?

Yes. Rest-server makes my backups crypto trojan safe, which is a very calming feeling.

@wojas
Contributor

wojas commented Sep 7, 2023

POSIX_FADV_DONTNEED probably does make sense.

Rereading the data files after write could be useful as an opt-in feature. We could even try to write again once if it fails if the request data is still in memory and we are not currently streaming it to disk. I am not sure if this would actually guarantee that it is really read from disk.

I am not sure about the consequences of using O_DIRECT.

@JsBergbau
Contributor Author

> We could even try to write again once if it fails if the request data is still in memory and we are not currently streaming it to disk.

Are you sure about this? Personally, I would prefer to abort with an error, something like "check hardware".

> I am not sure about the consequences of using O_DIRECT.

My first intention was to use this when writing the file, to ensure there is no cache, but it makes writing very slow: https://stackoverflow.com/questions/72539027/why-is-writing-files-witht-syscall-o-direct-flag-make-writing-files-slower-in-go

I have no idea how O_DIRECT would change the time needed for reading the file, so this is something that needs testing.
