Disk queue looses message on rsyslog restart #5364
Comments
Hmm, I wonder if the lost messages were sent to the buffer feeding the omprog
binary but were never read there.
This may be a case where you need to get a debug log from shortly before
shutdown (so we can see what's 'in flight' to the program) through restart.
I think Rainer or someone else at Adiscon will need to look into this in detail.
(it depends how easy it is to duplicate the problem)
David Lang
On Sun, 14 Apr 2024, Dmitry Shevtsov wrote:
Date: Sun, 14 Apr 2024 09:21:54 -0700
Subject: [rsyslog/rsyslog] Disk queue looses message on rsyslog restart (Issue #5364)
### Expected behavior
A disk queue that fsyncs every message (`queue.syncqueuefiles="on"`) and uses `queue.checkpointInterval="1"` should not lose the message when rsyslog restarts.
### Actual behavior
The disk queue loses the message when rsyslog restarts.
### Steps to reproduce the behavior
1. Configure the omprog module to send the message to the binary's stdin:

        module(load="omprog")
        if $msg contains 'session opened' then {
            action(type="omprog"
                   binary="/path/to/binary"
                   confirmMessages="on"
                   confirmTimeout="2000"
                   signalOnClose="off"
                   closeTimeout="10000"
                   killUnresponsive="on"
                   forceSingleInstance="off"
                   fileCreateMode="0600"
                   output="/var/log/error.log"

                   action.resumeRetryCount="-1"
                   action.resumeInterval="2"

                   queue.spoolDirectory="/var/log/rsyslog"
                   queue.filename="rsyslog-queue"
                   queue.size="10000"
                   queue.type="Disk"
                   queue.syncqueuefiles="on"
                   queue.checkpointInterval="1"
                   queue.workerThreads="1"
                   queue.maxFileSize="64m"
                   queue.timeoutshutdown="5000"
                   queue.timeoutActionCompletion="2000"
                   queue.saveOnShutdown="on")
        }
2. Write a binary that prints "OK" on startup, then reads stdin until EOF and negatively acknowledges every message by writing anything other than "OK" to stdout and an error to stderr (so the failures show up in the logs checked in the following steps).
3. Log in to the system as any user; this should trigger the action.
4. Observe in rsyslog's logs that the message is retried (or configure impstats and check that the queue size becomes 1).
5. Restart rsyslog with `systemctl restart rsyslog`.
6. Observe that the message is no longer retried (impstats reports a queue size of 0, and there is no rsyslog-queue file in the /var/log/rsyslog/ folder).
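
For anyone trying to reproduce this, the helper described in step 2 could look roughly like the Python sketch below. It is not the reporter's actual binary. The function name `run`, the `ERROR` reply and the wording of the stderr line are assumptions made for illustration; the only behavior taken from the steps above is the startup "OK", reading until EOF, and replying with something other than "OK" for every message.

```python
#!/usr/bin/env python3
"""Stand-in for the omprog helper from step 2: confirm startup, reject every message."""
import sys


def run(stdin, stdout, stderr):
    """Speak the confirmMessages="on" handshake but never confirm a message.

    Returns the number of messages that were rejected.
    """
    # Startup confirmation, as described in step 2.
    stdout.write("OK\n")
    stdout.flush()

    rejected = 0
    for line in stdin:  # the loop ends at EOF, i.e. when the pipe is closed
        message = line.rstrip("\n")
        # Diagnostic on stderr so the rejection is visible in the logs.
        stderr.write(f"rejecting message: {message}\n")
        stderr.flush()
        # Any reply other than "OK" is a negative acknowledgement.
        # "ERROR" is an arbitrary choice for this sketch.
        stdout.write("ERROR\n")
        stdout.flush()
        rejected += 1
    return rejected
```

To use it as the binary from step 1, the script would end with a call such as `run(sys.stdin, sys.stdout, sys.stderr)`, be made executable, and be referenced from `binary=`. Flushing after every line matters here, because a reply that sits in a buffer would presumably look like a timeout rather than a rejection.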
### Environment
RHEL 9.3

    rsyslogd 8.2102.0-117.el9 (aka 2021.02) compiled with:
        PLATFORM: x86_64-redhat-linux-gnu
        PLATFORM (lsb_release -d):
        FEATURE_REGEXP: Yes
        GSSAPI Kerberos 5 support: Yes
        FEATURE_DEBUG (debug build, slow code): No
        32bit Atomic operations supported: Yes
        64bit Atomic operations supported: Yes
        memory allocator: system default
        Runtime Instrumentation (slow code): No
        uuid support: Yes
        systemd support: Yes
        Config file: /etc/rsyslog.conf
        PID file: /var/run/rsyslogd.pid
        Number of Bits in RainerScript integers: 64
### rsyslog configuration file
    global(workDirectory="/var/lib/rsyslog")
    module(load="builtin:omfile" Template="RSYSLOG_TraditionalFileFormat")
    module(load="imuxsock"    # provides support for local system logging (e.g. via logger command)
           SysSock.Use="off") # Turn off message reception via local log socket;
                              # local messages are retrieved through imjournal now.
    module(load="imjournal"             # provides access to the systemd journal
           UsePid="system"              # PID number is retrieved as the ID of the process the journal entry originates from
           StateFile="imjournal.state") # File to store the position in the journal

    include(file="/etc/rsyslog.d/*.conf" mode="optional")
    *.info;mail.none;authpriv.none;cron.none    /var/log/messages
    authpriv.*                                  /var/log/secure
    mail.*                                      -/var/log/maillog
    cron.*                                      /var/log/cron
    *.emerg                                     :omusrmsg:*
    uucp,news.crit                              /var/log/spooler
    local7.*                                    /var/log/boot.log
It would be useful to have a minimal debug log covering such an incident.
Rainer
On Mon, 15 Apr 2024 at 4:56, David Lang ***@***.***> wrote: