Replies: 10 comments
-
Use inject-mode(aggregate-only), but please note that grouping-by() will store all messages belonging to the same context for 10 seconds. That can be a lot.
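A minimal sketch of that advice, assuming the goal is to emit each unique message once per window. The parser name p_dedup is illustrative, and keying on ${MESSAGE} rather than ${RAWMSG} avoids per-message timestamps splitting otherwise-identical lines into separate groups:

```
parser p_dedup {
    grouping-by(
        key("${MESSAGE}")
        aggregate(
            value("MESSAGE" "${MESSAGE}")
        )
        inject-mode(aggregate-only)
        timeout(10)
    );
};
```

With inject-mode(aggregate-only), only the aggregate message is injected into the log path; the individual messages that formed the group are dropped, which is the deduplication behavior discussed in this thread.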
On Wed, Sep 20, 2023 at 1:32 PM Peter Czanik wrote:
I got quite a lot of questions about how to drop log messages when the
identical lines are not right next to each other. A kind of next-level
suppress()...
My initial idea was using the grouping-by() parser with RAWMSG to find
identical messages. But how can I drop identical log messages? My current
configuration just creates extra logs instead of dropping duplicate log
messages...
source s_net {
    tcp(port(514) flags(store-raw-message));
};
destination d_noparse {
    file("/var/log/noparse");
};
log {
    source(s_net);
    destination(d_noparse);
};
destination d_group {
    file("/var/log/group");
};
parser p_supress {
    grouping-by(
        key("${RAWMSG}")
        aggregate(
            value("MESSAGE" "bla bla bla ${RAWMSG}")
        )
        timeout(10)
    );
};
log {
    source(s_net);
    parser(p_supress);
    destination(d_group);
};
--
Bazsi
-
Thanks! It is completely undocumented, but works as expected in my simple test. It is a long-requested feature, so I'll do some stress testing and check some corner cases. But it looks promising. OK, found it: it was introduced during 3.38 development... #3998
-
There might be some memory leak. At least my assumption was that if the timeout is over and there are no incoming logs, then memory is freed. However, in practice RAM usage stayed the same or shrank only minimally even after 10+ minutes. I am using a git snapshot from yesterday.
I'll do some more tests with more realistic logs, as currently I send the same 50 lines in a loop using loggen. It works as expected with a UDP source or with a single TCP connection, but sometimes there are extra logs in the output when using multiple threads in loggen. My current config is:
-
Still not yet realistic logs, but I ran into another interesting thing. I sent some logs using loggen. As all lines are different, I should get the same number of logs in both files. However, it's not the case:
czplaptop:~ # wc -l /var/log/noparse /var/log/group
17513636 /var/log/noparse
13539782 /var/log/group
31053418 total
I double-checked, and all lines are different, as you can see:
czplaptop:~ # sort /var/log/noparse | uniq | wc -l
17513636
Could it also explain the 5G memory usage of syslog-ng?
czplaptop:~ # ps aux | grep syslog-ng
root 22988 54.6 18.4 6488232 5998400 ? Ssl 12:46 5:11 /usr/sbin/syslog-ng -F
-
Another fun one: I sent fifty lines in a loop to the above config. Forty-five of those are unique. Still, the output of grouping-by() only contains 43 lines:
-
And another one, when sending the same 50 logs in a loop over multiple active connections:
-
On Thu, Sep 21, 2023, 09:53 Peter Czanik wrote:
There might be some memory leak. At least my assumption was that if the
timeout is over and there are no incoming logs, then memory is freed.
However, in practice RAM usage stayed the same or shrank only minimally
even after 10+ minutes. I am using a git snapshot from yesterday.
The system malloc does not return memory to the operating system eagerly,
so even if syslog-ng does free() its allocations, they still show up as
allocated when you look at the process with ps.
We are using jemalloc in the AxoSyslog container as that's both faster and
releases memory better.
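For reference, switching the allocator does not require a rebuild; preloading jemalloc at startup is enough. This is only a sketch: the library path below is an assumption that varies by distribution, and the syslog-ng invocation mirrors the one shown later in this thread.

```
# Preload jemalloc so it replaces the system malloc for this process.
# The .so path is distro-dependent (this one is typical on Debian/Ubuntu).
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 /usr/sbin/syslog-ng -F
```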
I'll do some more tests with more realistic logs, as currently I send the
same 50 lines in a loop using loggen.
It works as expected with a UDP source or with a single TCP connection,
but sometimes there are extra logs in the output when using multiple
threads in loggen.
loggen can be buggy too; sending over multiple threads should not matter.
My current config is:
czplaptop:/etc/syslog-ng/conf.d # cat supress.conf
source s_net {
    tcp(port(514) flags(store-raw-message));
};
destination d_noparse {
    file("/var/log/noparse");
};
log {
    source(s_net);
    destination(d_noparse);
};
destination d_group {
    file("/var/log/group");
};
parser p_supress {
    grouping-by(
        key("${RAWMSG}")
        aggregate(
            value("RAWMSG" "${RAWMSG}")

This probably includes a timestamp, which will cause all messages to go
into separate buckets.
…

        )
        inject-mode(aggregate-only)
        timeout(10)
    );
};
log {
    source(s_net);
    parser(p_supress);
    destination(d_group);
};
czplaptop:/etc/syslog-ng/conf.d #
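A variant worth noting here: keep the aggregate, but key on ${MESSAGE} so the per-message timestamp in ${RAWMSG} no longer splits identical lines into separate buckets, and use the $(context-length) template function to record how many duplicates each aggregate replaced. This is a sketch; the wording of the value() template is illustrative:

```
parser p_supress {
    grouping-by(
        key("${MESSAGE}")
        aggregate(
            value("MESSAGE" "${MESSAGE} [repeated $(context-length) times in 10s]")
        )
        inject-mode(aggregate-only)
        timeout(10)
    );
};
```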
-
Hi,
On Thu, Sep 21, 2023, 12:57 Peter Czanik wrote:
Still not yet realistic logs, but I ran into another interesting thing. I
sent some logs using loggen. As all lines are different, I should get the
same number of logs in both files. However, it's not the case:
czplaptop:~ # wc -l /var/log/noparse /var/log/group
17513636 /var/log/noparse
13539782 /var/log/group
31053418 total
czplaptop:~ # sort /var/log/noparse | uniq | wc -l
17513636
Did you wait 10 seconds after loggen stopped? That's how long it takes for
grouping-by() to evict its state.
I double checked, and all lines are different, as you can see. Could it
also explain the 5G memory usage of syslog-ng?
czplaptop:~ # ps aux | grep syslog-ng
root 22988 54.6 18.4 6488232 5998400 ? Ssl 12:46 5:11 /usr/sbin/syslog-ng -F
grouping-by() will store copies of messages in its context until the
timeout expires.
You sent ~17M messages, each taking ~1 kB of memory, which would be ~17 GB.
-
I don't know when you ran those wc commands. But you did tell grouping-by()
that it should wait 10 seconds for the entire group to be collected. So
once you stop loggen, please wait those 10 seconds and then check the
outputs.
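Concretely, the measurement procedure suggested here looks like the following. The loggen invocation is copied verbatim from the run quoted below; the 12-second sleep just adds slack beyond the 10-second timeout:

```
# Generate load for 120 seconds (same invocation as in the quoted test).
loggen -i -S -R /root/mylogs --active-connections=10 -r 10000 -d -l -I 120 localhost 514

# Wait for grouping-by() to evict its 10-second contexts, then compare.
sleep 12
wc -l /var/log/noparse /var/log/group
```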
On Thu, Sep 21, 2023, 13:23 Peter Czanik wrote:
And another one, when sending the same 50 logs in a loop over multiple
active connections:
czplaptop:~ # loggen -i -S -R /root/mylogs --active-connections=10 -r 10000 -d -l -I 120 localhost 514
[...]
count=11466918, rate = 92808.01 msg/sec
count=11514366, rate = 94882.34 msg/sec
count=11560441, rate = 92133.97 msg/sec
average rate = 96322.23 msg/sec, count=11560441, time=120.018, (average) msg size=137, bandwidth=12913.19 kB/sec
czplaptop:~ # wc -l /var/log/noparse /var/log/group
11560441 /var/log/noparse
75 /var/log/group
11560516 total
-
I had a test file, with logs like these:

And used

Without

When I generated log messages using
-
I got quite a lot of questions about how to drop log messages when the identical lines are not right next to each other. A kind of next-level suppress()...
My initial idea was using the grouping-by() parser with RAWMSG to find identical messages. But how can I drop identical log messages? My current configuration just creates extra logs instead of dropping duplicate log messages...