Improvement from using asynchronous --background|--bg for balance operation? #40

Open
sten0 opened this issue Nov 9, 2017 · 9 comments
@sten0
Contributor

sten0 commented Nov 9, 2017

I recently read about btrfs-balance's --background|--bg option. Would desktop users of btrfsmaintenance benefit from increased system responsiveness due to its asynchronous nature? I wonder because asynchronous filesystem operations don't block the GUI the way synchronous ones tend to. Or does an asynchronous balance introduce the possibility of filesystem inconsistency if an unexpected interruption occurs?

@barak

barak commented Sep 3, 2018

All the background flag does is detach the balance process. It's for use in a terminal, where you want to start a balance operation, have your shell return right away, and monitor progress with btrfs balance status /filesystem. It shouldn't make any real difference to the speed, only to where the chatter goes. Here we want the chatter so it can be logged, so the flag wouldn't make sense.
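
For illustration, a minimal sketch of the two invocation styles described above. The mount point /mnt/data, the -dusage=50 filter, and the function names are assumptions made for the example, not settings taken from btrfsmaintenance. The BTRFS variable is only there so the commands can be dry-run with BTRFS=echo.

```shell
# Hypothetical mount point; replace with your own filesystem.
MNT="${MNT:-/mnt/data}"
# Set BTRFS=echo to print the commands instead of running them.
BTRFS="${BTRFS:-btrfs}"

# Foreground: blocks until the balance finishes, so the caller
# (for example a service that logs output) sees all the chatter.
balance_foreground() {
    $BTRFS balance start -dusage=50 "$MNT"
}

# Background: the start command returns immediately; progress is
# polled separately with "balance status".
balance_background() {
    $BTRFS balance start --bg -dusage=50 "$MNT"
    $BTRFS balance status "$MNT"
}
```

Either way the kernel does the same work; only the point at which the command returns differs.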

@kdave kdave added the question label Sep 25, 2018
@ghost

ghost commented Jul 10, 2020

Would wrapping maintenance tasks in cgroups or nice/ionice have any effect? Much of the CPU usage is in kernel threads, but some is in btrfs user space too.

@barak

barak commented Jul 10, 2020

That sounds like a good idea to me. This stuff should all have extremely low priority for both CPU and IO, and there's no harm making double-sure of that.
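
A minimal sketch of such a wrapper, assuming the standard nice and ionice utilities are installed. The function name low_prio is made up for the example. Note that this only lowers the priority of the user-space process it launches; whether the idle I/O class has any effect depends on the active I/O scheduler, and the kernel threads doing the bulk of the work are not covered by it.

```shell
# Run any command at the lowest CPU priority (nice 19) and in the
# idle I/O scheduling class (ionice class 3).
low_prio() {
    nice -n 19 ionice -c 3 "$@"
}

# Example (hypothetical mount point):
#   low_prio btrfs scrub start -B /mnt/data
```

For scrub specifically, btrfs scrub start also accepts its own -c option to set the I/O priority class, which may be an alternative to wrapping it in ionice.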

@sten0
Contributor Author

sten0 commented Jul 11, 2020 via email

@ghost

ghost commented Jul 11, 2020

I am not sure this is placebo at all. kdave correctly identifies one problem area with the use of mq-deadline vs cfq/bfq. This shows it is at least possible.

Ultimately, perhaps the easiest option is to change the IO scheduler.
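
As a rough sketch of how one might check which scheduler a device is using: the kernel lists the available schedulers in /sys/block/<device>/queue/scheduler and marks the active one in square brackets. The helper name active_scheduler and the device name sda are assumptions for the example; the optional second argument exists only so the function can be pointed at a different directory.

```shell
# Print the active I/O scheduler for a block device.
# $1: device name, e.g. sda (hypothetical)
# $2: optional sysfs root, defaults to /sys/block
active_scheduler() {
    # The active scheduler is the entry shown in [brackets].
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "${2:-/sys/block}/$1/queue/scheduler"
}

# Switching requires root, and the chosen scheduler must be
# available in the running kernel:
#   echo bfq > /sys/block/sda/queue/scheduler
```

A change made this way does not persist across reboots.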

Mind that my comment #40 (comment) was about user space CPU usage. On a low-end system this can be considerable and affect the responsiveness of the overall system. Nice/cgroups do help here. These are my own findings, based on testing on a slow NAS box.

@barak

barak commented Jul 13, 2020

This is a pretty serious problem. What often happens in practice, on a laptop that is suspended unless in active use, is this: you open your laptop to give a lecture first thing Monday morning; it has been suspended for four days and would have wanted to run a scrub at 04:00 Sunday morning, so instead, shazam, the scrub starts right then and your 09:00 Monday morning lecture or VoIP meeting or whatever is ruined.

There are two questions. One is how to get a scrub to be less annoying when it does run, the other is how to not do a scrub at a bad time. All around, scrub shouldn't bring the machine to its knees. One way to accomplish that would be to tweak the priority etc of scrub itself to be much less aggressive. That's a kernel-level thing that people are working on, but apparently it's really hard for technical reasons I don't fully understand.

Another would be to monitor whether there is any other activity, and if so, stop the scrub with btrfs scrub cancel and continue it later with btrfs scrub resume. But in any case we should not be starting the scrub under circumstances when one can be pretty confident it will be super annoying, like when a human being just resumed their laptop, or when they're typing on the keyboard or using the mouse or running some interactive process.
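
A minimal sketch of one polling step of that cancel/resume idea. How "activity" is detected is deliberately left open here: the caller passes in active or idle, and the function name scrub_step and the mount point are hypothetical. BTRFS=echo gives a dry run.

```shell
# Set BTRFS=echo to print the commands instead of running them.
BTRFS="${BTRFS:-btrfs}"

# One polling step: stop the scrub while the user is active,
# continue it from where it left off once the machine is idle.
# $1: mount point; $2: "active" or "idle"
scrub_step() {
    case "$2" in
        active) $BTRFS scrub cancel "$1" ;;
        idle)   $BTRFS scrub resume "$1" ;;
        *)      echo "usage: scrub_step <mountpoint> active|idle" >&2
                return 2 ;;
    esac
}

# Example:
#   scrub_step /mnt/data active
```

Both commands return an error when there is nothing to cancel or resume, so a real loop would need to tolerate that.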

This is sort of a systemd-level issue, since it's systemd's job to monitor what's going on and enforce policies to make the computer work properly. Systemd is in a good position to know that some service (like a scrub) can be run anytime, and to know whether the computer is in actual interactive use, and to know how to pause a service until a better time. Would it make sense to open a dialog with the systemd people about this? They must have thought about this sort of thing before, because the exact same issue comes up for periodic backups and many other sorts of routine maintenance tasks.

@ghost

ghost commented Jul 13, 2020

> There are two questions. One is how to get a scrub to be less annoying when it does run, the other is how to not do a scrub at a bad time.
> ...
> This is sort of a systemd-level issue, since it's systemd's job to monitor what's going on and enforce policies to make the computer work properly.

I am pretty sure you can set systemd to not run missed scheduled jobs, and also to monitor battery or other usage.
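
A sketch of what that could look like as systemd drop-in files, written out by a small shell function. The unit names btrfs-scrub.timer and btrfs-scrub.service, the drop-in file names, and the function name are assumptions for the example; the directives themselves (Persistent=, ConditionACPower=, Nice=, IOSchedulingClass=) are standard systemd options.

```shell
# Write two drop-in files under $1 (default: /etc/systemd/system).
write_scrub_dropins() {
    root="${1:-/etc/systemd/system}"
    mkdir -p "$root/btrfs-scrub.timer.d" "$root/btrfs-scrub.service.d"

    # Do not catch up on runs that were missed while the machine
    # was powered off or suspended.
    printf '[Timer]\nPersistent=false\n' \
        > "$root/btrfs-scrub.timer.d/no-catchup.conf"

    # Skip the run when on battery, and lower CPU and I/O priority.
    printf '[Unit]\nConditionACPower=true\n\n[Service]\nNice=19\nIOSchedulingClass=idle\n' \
        > "$root/btrfs-scrub.service.d/low-impact.conf"
}

# Afterwards (as root): systemctl daemon-reload
```

Persistent=false skips a missed run rather than deferring it to a quieter moment, so it only covers part of the problem.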

@barak

barak commented Jul 13, 2020

We should put that stuff in the unit file then, at least as much as is currently expressible, with a comment explaining what else would be nice.

I don't think simply skipping a run whose time was missed is right. It still needs to be run. Just maybe not right away; maybe wait until the machine has been quiescent for a while. And instead of "every other Monday 02:30" maybe we could say "every two weeks of total uptime, or six weeks of wall-clock time plus at least three days of uptime, whichever comes first; and only when the machine has been up but quiescent for at least 30 min". Because if you just turn it on for one hour every week, you really don't want a scrub every other time you open it. So there has to be something that takes account of both uptime and real time.
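
The rule quoted above can be written down as a small decision function. This is only a sketch of the policy, not something systemd or btrfsmaintenance provides: the function name scrub_due is made up, and how the accumulated uptime and idle time would be measured is left open.

```shell
# Decide whether a scrub is due. All arguments are in seconds.
# $1: uptime accumulated since the last scrub
# $2: wall-clock time since the last scrub
# $3: how long the machine has been quiescent
# Returns 0 when a scrub should start, 1 otherwise.
scrub_due() {
    day=86400
    # Never start unless the machine has been quiet for 30 minutes.
    [ "$3" -ge 1800 ] || return 1
    # Two weeks of total uptime is enough on its own.
    [ "$1" -ge $((14 * day)) ] && return 0
    # Otherwise: six weeks of wall-clock time plus three days of uptime.
    [ "$2" -ge $((42 * day)) ] && [ "$1" -ge $((3 * day)) ]
}

# Example: 3 days of uptime, 6 weeks elapsed, idle for an hour
#   scrub_due 259200 3628800 3600 && echo "start scrub"
```

A laptop that is opened for one hour a week accumulates uptime slowly, so it would be scrubbed only once the wall-clock condition and the three-day uptime floor are both met.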

@ghost

ghost commented Jul 13, 2020

All good suggestions. I don't use systemd, but based on the docs there is an OnActiveSec option which might be useful.

Perhaps you can test and then propose new settings?

3 participants