Snapper default config = me out of disk space #794

wallentx opened this issue Mar 11, 2024 · 6 comments

@wallentx
Contributor

wallentx commented Mar 11, 2024

Just typing this up briefly; I can circle back and provide more info. It seems that my rootfs ran out of space due to the behavior of Snapper and my being unfamiliar with it. It is actually wonderful that this happened to me, because now I know what was taking up all the space on my Steam Deck, entirely unrelated to another issue I have on my Steam Deck.

I'm not sure what needs to be tweaked, but it seems that the rate at which I'm triggering snapshots to happen (by compiling various packages?) resulted in about 360GB of snapshots.

I went here, as per the Arch Wiki: http://snapper.io/2016/05/18/space-aware-cleanup.html. It seems like qgroups weren't enabled by default (maybe intentionally), but doing those steps freed up 6.1 GB.
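
For anyone following along, the steps from that post boil down to roughly the sketch below. This is not the winesapOS setup: the config name `root`, the `SPACE_LIMIT` value of 0.2, and the helper names `run` and `enable_space_aware_cleanup` are assumptions for illustration. It only prints the commands unless `DRY_RUN=0` is set, in which case it has to run as root.

```shell
#!/bin/sh
# Sketch: enable Btrfs quota groups for one Snapper config and cap snapshot space.
# Assumed values (not from this thread): config "root", SPACE_LIMIT 0.2 (20%).
CONFIG="${CONFIG:-root}"
SPACE_LIMIT="${SPACE_LIMIT:-0.2}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

enable_space_aware_cleanup() {
    # Create the qgroup Snapper uses to measure snapshot space.
    run snapper -c "$CONFIG" setup-quota
    # Fraction of the file system that snapshots may occupy.
    run snapper -c "$CONFIG" set-config "SPACE_LIMIT=$SPACE_LIMIT"
    # Trigger the number cleanup once so the new limit is applied.
    run snapper -c "$CONFIG" cleanup number
}

enable_space_aware_cleanup
```

Running it as-is just lists the three commands, which makes it easy to review them before setting `DRY_RUN=0`.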

I need to familiarize myself with this and figure out what an optimal snapshot config is

@LukeShortCloud LukeShortCloud added the bug Something isn't working label Mar 11, 2024
@LukeShortCloud
Owner

Hey @wallentx, thanks for bringing this to my attention! I have historically had trouble fine-tuning the Snapper configuration. I thought I finally got it working, but maybe not, given your situation.

The expected behavior is to keep:

  • 10 hourly snapshots
  • 12 monthly snapshots
  • 1 yearly snapshot
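
In Snapper config terms, those limits would look something like the fragment below. It is a sketch, not a copy of the winesapOS config: only the three limits listed above come from this thread, the `TIMELINE_CREATE` and `TIMELINE_CLEANUP` lines are assumed, and the daily and weekly limits are left out because they are not stated here.

```shell
# Fragment of a Snapper config file (for example /etc/snapper/configs/root).
# Assumed: timeline creation and cleanup are both switched on.
TIMELINE_CREATE="yes"
TIMELINE_CLEANUP="yes"

# Retention limits as described in this thread.
TIMELINE_LIMIT_HOURLY="10"
TIMELINE_LIMIT_MONTHLY="12"
TIMELINE_LIMIT_YEARLY="1"
```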

We use the same configuration for Snapper for both the / root and /home directories. A snapshot is also taken whenever pacman is used (there is both a pre- and post-installation snapshot).

Or perhaps it is working as expected and our configuration is overkill? Having 360 GB of snapshots does seem excessive.

Here are some hints about how this is implemented in winesapOS to hopefully help you:

@LukeShortCloud
Owner

LukeShortCloud commented Mar 11, 2024

Somewhat related: when Bcachefs stabilizes over the next year or two, I was considering switching the winesapOS root file system from Btrfs to Bcachefs for new installations.

The biggest new feature that interests me is the use of a cache drive. A user could, in theory, run the OS from a slow flash drive but use a local cache file on a mounted internal NVMe drive. I do something similar with OpenZFS today. As much as I love OpenZFS, because it's an out-of-tree kernel module, it scares me to use it for the root file system. It would be common/easy for an upgrade to break the ZFS DKMS module. Bcachefs, on the other hand, is built into the Linux kernel and will always be available.

Snapper recently got support for Bcachefs snapshots but still needs to find a way to handle recovery: https://github.com/openSUSE/snapper/issues/858

@LukeShortCloud
Owner

I think the problem is that winesapOS is simply configured to take too many backups. Even the 1 yearly backup could be an issue. If you have old files you deleted less than 2 years ago, they would still remain on the system. The NUMBER_LIMIT is also set to 50 in our configuration. We can lower that.
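
As a rough upper bound on how many snapshots those settings can keep around, here is a quick back-of-the-envelope sketch. It assumes the timeline limits quoted earlier in this thread (10 hourly, 12 monthly, 1 yearly), the NUMBER_LIMIT of 50, and the same settings for both / and /home. It ignores any daily or weekly limits and NUMBER_LIMIT_IMPORTANT, none of which are stated here.

```shell
#!/bin/sh
# Timeline limits quoted in this thread.
hourly=10
monthly=12
yearly=1
# Snapshots kept by the number cleanup (e.g. the snap-pac pre/post snapshots).
number_limit=50
# Same settings applied to / and /home.
configs=2

per_config=$((hourly + monthly + yearly + number_limit))
total=$((per_config * configs))
echo "per config: $per_config, total: $total"
```

That works out to 73 snapshots per config and 146 across both, so lowering NUMBER_LIMIT has by far the largest effect on the count.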

@wallentx Disabling the Snapper timeline would likely help you short-term. There is still snap-pac installed, which takes backups whenever you install or remove a Pacman package.

$ sudo systemctl disable --now snapper-timeline.timer snapper-cleanup-hourly.timer

I am open to suggestions for how many hourly, daily, weekly, and monthly backups we take/keep (along with the NUMBER_LIMIT to define the maximum total number). The yearly backup seems excessive in practice.

You also originally asked about quota groups in Snapper. We do not have that configured in winesapOS. Feel free to open a PR if you would like. It sounds promising!

@wallentx
Contributor Author

wallentx commented Apr 6, 2024

I was just thinking about this, and saw that you commented 1 minute ago 👀

I aim to look into a proper config for using these Snapper quota groups, but I think I killed the drive that winesapOS is installed on.

I initially installed it on an external NVMe enclosure, with the M.2 2230 NVMe that came with my Steam Deck, and either the drive is flaky, or the NVMe is flaky, because I've always noticed that the OS would sometimes hang during heavy writes/reads. Now it's behaving in a way where it locks up immediately after logging in. There's a chance that something related to one of the KDE packages that got updated is causing the lockups, but I haven't really tried to debug it, and I'll ultimately need to just install this on a different external drive.

At any rate, once I get stuff migrated over, I'll copy the config I had set up and PR it.

@LukeShortCloud
Owner

Dang, sorry to hear about your drive failing! I've heard of that happening with flash drives, since they use cheaper NAND flash chips that tolerate far fewer writes. Not great for an OS. An NVMe drive failing is definitely rarer.

I upgraded my external setup to a USB-C 3.2 enclosure in addition to switching to DRAM drives (there are great prices for used ones on eBay). That helped minimize my problems. Internal drives will always have the best throughput, though. Newer versions of winesapOS provide ZRAM as an alternative to swap. That could possibly help as well.

Anyways, thanks for the update! I hope you can get everything fully recovered. I've had a bad time trying to optimize Snapper in the past so I appreciate the help!

@LukeShortCloud
Owner

Maybe I'm overthinking all of this.

What if we disable time-based backups? We already have snap-pac which takes a pre- and post- un/install snapshot when using Pacman.

Or go a step further and only manually create one pre- and one post-upgrade snapshot when running a winesapOS upgrade.
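
To make that second idea concrete, a manual pre/post pair around an upgrade could look roughly like this. It is only a sketch under assumptions: the config name `root`, the helper name `snapshot_around`, the description text, and the placeholder upgrade command are all made up for illustration, and this is not how the winesapOS upgrade works today. By default it only prints what it would do; set `DRY_RUN=0` and run it as root to create real snapshots.

```shell
#!/bin/sh
# Assumed values (not from this thread): config "root", dry-run by default.
CONFIG="${CONFIG:-root}"
DRY_RUN="${DRY_RUN:-1}"

# Usage: snapshot_around "description" command [args...]
snapshot_around() {
    description="$1"
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ snapper -c $CONFIG create --type pre --print-number --cleanup-algorithm number --description \"$description\""
        echo "+ $*"
        echo "+ snapper -c $CONFIG create --type post --pre-number <N> --cleanup-algorithm number --description \"$description\""
        return 0
    fi
    # --print-number outputs the new snapshot number so the post snapshot can refer to it.
    pre=$(snapper -c "$CONFIG" create --type pre --print-number \
        --cleanup-algorithm number --description "$description") || return 1
    "$@"
    status=$?
    # Take the post snapshot even if the wrapped command failed.
    snapper -c "$CONFIG" create --type post --pre-number "$pre" \
        --cleanup-algorithm number --description "$description"
    return "$status"
}

# Placeholder command; the real upgrade steps would go here.
snapshot_around "winesapOS upgrade" echo "upgrade steps go here"
```

Tagging both snapshots with the `number` cleanup algorithm means they would still be pruned according to NUMBER_LIMIT rather than kept forever.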

I try to follow the 3-2-1 rule of backups myself so having local on-system backups is selfishly not as important to me. I'm not sure how useful it is for our end-users. Most people randomly run out of disk space for seemingly no reason due to this. I only hear about Btrfs snapshots when people complain about 100% disk usage. 😅

Definitely open to thoughts here.
