do not exit with non-zero exit code when using --alert flag and Failed to open database "/var/lib/vnstat/vnstat.db" in read-only mode. throws #259

Open
n0099 opened this issue Apr 13, 2024 · 5 comments


n0099 commented Apr 13, 2024

I'm using this plain bash script as a fuse for a monthly data limit:

#!/bin/bash
#set -x
set -e

if ! vnstat --alert 2 3 m rx 500 GiB -i eth0; then
    iptables -A OUTPUT -p udp --sport 443 -j DROP
    iptables -A OUTPUT -p tcp --sport 443 -j DROP
    iptables -A OUTPUT -p tcp --sport 80 -j DROP
    exit 1
fi

but when the system is under high load, vnstat occasionally runs into #129:

Apr 13 08:10:00 azure systemd[1]: Starting n0099-vnstat-alert.service...
Apr 13 08:10:05 azure vnstat-alert.sh[113965]: Error: Failed to get info value for "dbversion" from database (5): database is locked
Apr 13 08:10:05 azure vnstat-alert.sh[113965]: Error: Failed to open database "/var/lib/vnstat/vnstat.db" in read-only mode.
Apr 13 08:10:05 azure systemd[1]: n0099-vnstat-alert.service: Main process exited, code=exited, status=1/FAILURE
Apr 13 08:10:05 azure systemd[1]: n0099-vnstat-alert.service: Failed with result 'exit-code'.
Apr 13 08:10:05 azure systemd[1]: Failed to start n0099-vnstat-alert.service.

and it exits with a non-zero exit code, which causes a false-positive alert.
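One possible workaround on the script side is to tell a real limit hit apart from a tool failure before acting on the exit status. The sketch below assumes that genuine failures always print a line starting with "Error:", as both failure lines in the journal excerpt do. That is an observation from this log, not documented vnstat behaviour. The function names, the retry count, and the delay are hypothetical.

```shell
# Classify a finished `vnstat --alert` run: prints ok, limit, or error.
# Assumption: genuine failures print a line starting with "Error:".
classify_alert_run() {
    local status="$1" output="$2"
    if [ "$status" -eq 0 ]; then
        echo ok
    elif printf '%s\n' "$output" | grep -q '^Error:'; then
        echo error
    else
        echo limit
    fi
}

# Usage: alert_with_retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND while it reports an error, then prints the final verdict.
alert_with_retry() {
    local attempts="$1" delay="$2" i=1 output status verdict=error
    shift 2
    while [ "$i" -le "$attempts" ]; do
        # Capture stdout and stderr together; record the status without
        # tripping `set -e` in the calling script.
        output=$("$@" 2>&1) && status=0 || status=$?
        verdict=$(classify_alert_run "$status" "$output")
        if [ "$verdict" != error ]; then
            break
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "$verdict"
}

# Possible use in the fuse script (hypothetical retry count and delay):
#   case "$(alert_with_retry 3 10 vnstat --alert 2 3 m rx 500 GiB -i eth0)" in
#       limit) ...add the iptables DROP rules here...; exit 1 ;;
#       error) echo "vnstat failed, limit not evaluated" >&2 ;;
#   esac
```

With this shape, a locked database only delays the check instead of triggering the iptables rules. Whether an unresolved error should fail open or fail closed is a policy choice left to the caller.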

Owner

vergoh commented Apr 13, 2024

Have you tested if setting DatabaseWriteAheadLogging to 1 (and then restarting the daemon) solves the situation?
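For reference, a minimal sketch of applying that setting from a shell. It assumes the configuration file is /etc/vnstat.conf, that it uses plain "Key value" lines, that GNU sed is available, and that the daemon runs as the systemd unit "vnstat". The helper name set_conf_key is hypothetical.

```shell
# Usage: set_conf_key FILE KEY VALUE
# Replaces an existing uncommented "KEY ..." line, or appends one if none exists.
# Assumption: GNU sed (for -i and -E) and a file that ends with a newline.
set_conf_key() {
    local file="$1" key="$2" value="$3"
    if grep -Eq "^[[:space:]]*${key}[[:space:]]" "$file"; then
        sed -i -E "s|^[[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file"
    else
        printf '%s %s\n' "$key" "$value" >>"$file"
    fi
}

# Possible use, as root (path and unit name are assumptions):
#   set_conf_key /etc/vnstat.conf DatabaseWriteAheadLogging 1
#   systemctl restart vnstat
```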

Author

n0099 commented Apr 13, 2024

I'll try this. I'm also using zfs on /var/lib; is this issue similar to atuinsh/atuin#952?

Owner

vergoh commented Apr 14, 2024

I don't have any practical experience with using zfs myself. A quick search for sqlite and zfs suggests that enabling WAL (which is what DatabaseWriteAheadLogging does) can help. You should also check the daemon logs, as it should produce warnings whenever a database write takes longer than 4 seconds. The frequency of those warnings before and after the DatabaseWriteAheadLogging configuration change should provide some indication of whether that helps or not. The read timeout is configured at compile time with DBREADTIMEOUTSECS to 5 seconds.

Author

n0099 commented Apr 14, 2024

$ sudo journalctl -u vnstat -g took
Apr 11 18:30:16 azure vnstatd[2169]: Warning: Writing cached data to database took 39.1 seconds.
Apr 11 18:30:16 azure vnstatd[2169]: Warning: Writing cached data to database took 7.5 seconds.
-- Boot c954ea177ebd4c759c98ef0e1fa04a87 --
Apr 13 03:20:11 azure vnstatd[1889]: Warning: Writing cached data to database took 4.3 seconds.
-- Boot c9c52571a5c7441f95967b2ca23cda41 --
Apr 13 08:10:06 azure vnstatd[1941]: Warning: Writing cached data to database took 6.3 seconds.
Apr 13 12:45:06 azure vnstatd[1941]: Warning: Writing cached data to database took 5.2 seconds.
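To compare the warning frequency before and after a configuration change, the same journal output can be summarized with a small filter. This is a sketch that assumes the warning text keeps the exact "Writing cached data to database took N seconds." form shown above. The function name slow_write_stats is hypothetical, and the default threshold of 5 is the read timeout mentioned earlier in the thread.

```shell
# Usage: journalctl -u vnstat -g took | slow_write_stats [THRESHOLD_SECONDS]
# Reads journal lines on stdin and prints the number of slow-write warnings,
# how many exceeded the threshold (default 5), and the longest write seen.
slow_write_stats() {
    local limit="${1:-5}"
    awk -v limit="$limit" '
        /Writing cached data to database took/ {
            # The duration is the field right after the word "took".
            for (i = 1; i < NF; i++)
                if ($i == "took") secs = $(i + 1) + 0
            n++
            if (secs > max) max = secs
            if (secs > limit) over++
        }
        END { printf "warnings=%d over_limit=%d max=%.1f\n", n, over, max }
    '
}
```

Fed the five warnings above, this prints `warnings=5 over_limit=4 max=39.1`: four of the five writes ran longer than the 5-second read timeout, which is when a concurrent read-only open could fail.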

Owner

vergoh commented Apr 17, 2024

Let me know if that setting helped with the read situation or not, as I'm not exactly sure if having WAL enabled also avoids the read-only error while slow writes are being done at the same time. With ZFS, the DatabaseSynchronous setting may also be worth investigating: part of the slow-write issue with sqlite could be due to sqlite trying to ensure the writes have completed while ZFS is also doing that internally at the same time, resulting in unnecessary redundant checks.

As for improving the detectability of the source of the exit status that's usually evaluated when --alert is used, I'll see if adding exit options 4 and 5, which would match the current 2 and 3 but use exit status 2 (instead of the 1 that all the other errors use too), would be the ideal solution, or if some sort of --actual-errors-do-not-exit-1 parameter would be better.
