
reduce latency of list operations in snapshot browser #228

Open
modem opened this issue Apr 17, 2024 · 7 comments
Labels
enhancement New feature or request

Comments


modem commented Apr 17, 2024

I would like to propose a few improvements for the file restore:

  1. Improve "Browse and Restore Files in Backup" performance.
     When opening the browse and restore area, it can take quite a while for the information to show in the backrest interface. I tried the following:
     • with one 17GB backup, it takes 4s every time we open any of the folded items.
     • with a 6+TB repo, it takes around 20s every time we open any of the folded items.
     That does not sound long, but the problem is that if we want to restore a file or folder deep inside the folder structure, these times add up. For example, to restore a file/folder that is 5 folders deep, we need to open 6 folded items before we see it, each taking around 20s (in the latter example). The more complicated the folder structure, the longer we wait for the Backrest interface to show the items/folders we want to restore. It would be nice to get a faster structure update, perhaps by caching the contents: even if the first opening of the backup and restore view took a bit longer to get the full list of items, every subsequent folder open would just load the already collected information into the interface.
  2. Allow restoring several items at the same time. We can currently only restore one file or folder at a time; if we want to restore multiple files/folders, we have to wait for the current restore to finish before we can start a new one.
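The caching idea in point 1 could look roughly like this: run `restic ls --json` once for the snapshot, group every entry by its parent directory, and serve each subsequent folder open from that in-memory index. This is only a minimal sketch, not backrest's actual code; the JSON field names (`struct_type`, `path`) follow restic 0.16's line-delimited output as far as I know, and the sample data below is made up.

```python
import json
from collections import defaultdict
from posixpath import dirname

def build_index(ls_json_lines):
    """Group every entry from `restic ls --json` output by its parent
    directory, so each folder open becomes a dictionary lookup instead
    of a new restic invocation."""
    index = defaultdict(list)
    for line in ls_json_lines:
        entry = json.loads(line)
        if entry.get("struct_type") != "node":  # skip the snapshot header line
            continue
        index[dirname(entry["path"])].append(entry)
    return index

# Hypothetical sample of restic's line-delimited JSON output:
sample = [
    '{"struct_type":"snapshot","id":"759b91f9"}',
    '{"struct_type":"node","path":"/docs","type":"dir"}',
    '{"struct_type":"node","path":"/docs/a.txt","type":"file","size":12}',
    '{"struct_type":"node","path":"/docs/b.txt","type":"file","size":34}',
]
idx = build_index(sample)
children = [e["path"] for e in idx["/docs"]]  # served from memory, no restic call
```

The one-time indexing pass costs the same as a full `restic ls`, but every folder open after that is effectively free, which is the tradeoff being proposed.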
@modem modem added the enhancement New feature or request label Apr 17, 2024
garethgeorge (Owner) commented

Hey, I'm curious how long restic ls <prefix> takes in your repos as that's the underlying command backrest is using for directory iteration.

The alternative approach Backrest could take would be to index all paths up front (e.g. in a trie in the database) before permitting browsing, at the cost of a long pause when the file browser is first opened. Slow per-folder operations with a more "immediate" startup is the tradeoff I've made for now.

If you need to do a lot of operations, restic supports restic mount (but mount command support isn't something that'll be added to backrest due to the underlying fuse dependency).

Allow to restore several items at the same time. We currently can only restore 1 file or folder at the time, if we want to restore multiple files/folders we need to wait until the current restore to finish until we can start a new one.

Mind opening another bug scoped to this issue? It helps to keep discussion somewhat scoped -- bulleted bugs are difficult to address wholesale.

Is the main issue that the snapshot browser closes itself after you select a file to restore?


modem commented Apr 18, 2024

Hey, I'm curious how long restic ls <prefix> takes in your repos as that's the underlying command backrest is using for directory iteration.

I tried it on the biggest repo mentioned above (6+TB); it took 25s to list all the files and folders (6735 items):

```
time /bin/restic-0.16.4 ls 759b91f9 -r /repos/BackupMain/
```

[screenshot]
This repo has several plans; I tested with the last snapshot of the biggest plan.
The difference is around +5s between the command line and opening just one branch in the user interface.

Mind opening another bug scoped to this issue? It helps to keep discussion somewhat scoped -- bulleted bugs are difficult to address wholesale.

Will do.

Is the main issue that the snapshot browser closes itself after you select a file to restore?

No, but I find it very difficult to select anything in this popup menu, as it disappears very quickly when moving the mouse from the folder to the menu:
[screenshot]


garethgeorge commented Apr 19, 2024

Hmm, 6735 items is actually quite a small file count (I'm guessing a small number of large files). This weekend I'll benchmark how long listing a full repo with a few hundred thousand files takes, to get a sense of how much time indexing the whole repo would cost. If it's not too bad, indexing upfront may be a reasonable way to go here.

Just to check -- it looks like you're using a repo on a local HDD? I wouldn't expect this to take 20 seconds, it seems like something strange is going on there. It typically takes ~10 seconds for me using a remote target (backblaze storage) in a 1TB repo.

Reading https://github.com/restic/restic/blob/228b35f074ddf4dec6ce1aea51ccfc2c413d0a01/cmd/restic/cmd_ls.go#L259-L411, I think restic is doing an optimized traversal of the tree data structure, avoiding reads of subdirectories until they're requested, so there's definitely a tradeoff to be had here in terms of how much we prefetch.
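The lazy-vs-eager prefetch tradeoff can be sketched with a toy model (purely illustrative; the tree, fetch counts, and function names below are made up, not restic's or backrest's actual code):

```python
# Toy model of the prefetch tradeoff. A "fetch" stands in for one
# restic tree-blob read (or one `restic ls` invocation).

tree = {  # hypothetical snapshot tree: directory -> children
    "/": ["/a", "/b"],
    "/a": ["/a/x", "/a/y"],
    "/b": ["/b/z"],
}

def lazy_list(path, stats):
    """One fetch per folder open: cheap startup, cost repeats per open."""
    stats["fetches"] += 1
    return tree.get(path, [])

def eager_index(stats):
    """Fetch the whole tree once; every later open is a dict lookup."""
    index = {}
    for directory, children in tree.items():
        stats["fetches"] += 1
        index[directory] = children
    return index

lazy_stats = {"fetches": 0}
for opened in ("/", "/a"):        # user drills two levels down
    lazy_list(opened, lazy_stats)

eager_stats = {"fetches": 0}
index = eager_index(eager_stats)  # pay 3 fetches up front...
children = index["/a"]            # ...then every open is free
```

Lazy wins when the user opens only a few folders (2 fetches vs 3 here); eager wins once the number of opens approaches the tree size, which is exactly the deep-folder case the original report describes.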

@garethgeorge garethgeorge changed the title File restore improvements reduce latency of list operations in snapshot browser Apr 19, 2024

garethgeorge commented Apr 19, 2024

Ran some benchmarks on a repo in Backblaze (B2), so network fetch time is included in this test.

restic ls on all files in a repo of 231819 files took:

```
~/.local/share/backrest/restic-0.16.4 ls latest  8.01s user 0.91s system 20% cpu 44.355 total
```

restic ls / (i.e. only the top-level folder) took:

```
snapshot d49e70a6 of [/tank_fast] filtered by [/] at 2024-04-18 18:30:05.494168399 -0700 PDT):
/tank_fast
~/.local/share/backrest/restic-0.16.4 ls latest /  5.75s user 0.58s system 32% cpu 19.501 total
```

Following up with restic ls /subdir (keeping the cache), I found that once the cache for a snapshot is hot, listings were pretty fast:

```
~/.local/share/backrest/restic-0.16.4 ls latest /subdir/  6.42s user 0.58s system 154% cpu 4.532 total
```

I reset the cache before the first and second operations by running:

```
export XDG_CACHE_HOME=$(mktemp -d)
```

tl;dr hard to say what the right tradeoff is here but I'm thinking snapshot browsing is in a pretty acceptable range. There may be something going on on your system that's degrading snapshot listing speed. Are you using a local storage repo / is your disk under heavy load? Any other factors that might contribute to the slowdown?


modem commented Apr 21, 2024

I'm using an external HDD connected through USB. This disk is only used for my backups, so most accesses go through backrest.
When I tested, backrest was not performing any backup or restore, so the disk should have been idle.
I see your listing is faster than mine (my backed-up files are big files), but I wonder how long it took in your backrest interface.

garethgeorge (Owner) commented

Hey, it's also taking on the order of 10 seconds with B2 as a remote on my interface or on the order of 2-5 seconds when using a local repo on an SSD.

Is the device you're running Backrest on memory constrained? I wonder if the forked restic processes are starting slowly or hitting memory pressure, as starting restic for each list operation can be expensive.

At a high level, I think I'll probably keep listing the way it is now, as the implementation is very simple and I think it works well enough on most devices (<10 seconds is acceptable latency IMO, as restores are uncommon), but we can look into debugging why we're seeing such slow listings on your installation.


modem commented Apr 25, 2024

I'm running it in a Docker container on a QNAP NAS.
In Portainer I can see it uses a lot of CPU and memory when running the restic ls command (see the spikes in the charts):
[screenshot]

Restarting the container clears the allocated memory, but does not affect the time it takes to run the restic ls command.

Anyway, the time it takes to run ls on the root:

```
time /bin/restic-0.16.4 ls --json e32b25f6c22bfeb0eee8d4a6883eb25820cfcdc8d56e7f785ba9daeb14dbf590 /raid/Popcorn/ -o sftp.args=-oBatchMode=yes
```

[screenshot]

is more or less the same as it takes to read the entire backup contents:

```
time /bin/restic-0.16.4 ls --json e32b25f6c22bfeb0eee8d4a6883eb25820cfcdc8d56e7f785ba9daeb14dbf590 -o sftp.args=-oBatchMode=yes
```

[screenshot]

So personally, I see a possible improvement in loading the entire backup contents once, rather than folder by folder. Of course there would be an impact on the logic afterwards to process the JSON results... but it could be done only once.
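The observation above (one folder costing about as much as the whole listing) makes sense if restic has to walk the full snapshot stream either way. A toy sketch, with made-up paths and a simplified cost model, of why parsing the full `restic ls --json` output once would amortize all later folder opens:

```python
# Toy cost model: listing one subfolder still scans every entry in the
# snapshot stream, so its cost roughly matches a full listing.
# The paths below are hypothetical.

lines = [f"/raid/dir{d}/file{f}" for d in range(3) for f in range(4)]

def list_subdir(prefix):
    """Filter the stream to one folder, counting entries scanned."""
    scanned, matched = 0, []
    for path in lines:           # the whole stream is walked regardless
        scanned += 1
        if path.startswith(prefix):
            matched.append(path)
    return matched, scanned

subdir, cost_one = list_subdir("/raid/dir1/")
full, cost_all = list(lines), len(lines)

# cost_one == cost_all: a single folder listing scans as many entries as
# a full listing, so caching the full result once pays for itself by the
# second folder open.
```

Under this model, caching the parsed full listing makes every folder open after the first essentially free, at the price of one full scan up front.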

I have 6 plans backing up to this repo; I'm not sure whether that has any negative impact. The 6735 items mentioned above are for just 1 plan, not the entire repo.
