
Regular collector oom-kill at midnight UTC on Sundays #400

Open · scottybrisbane opened this issue Mar 22, 2023 · 1 comment

@scottybrisbane
We are running pganalyze collectors on some of our larger EC2 self-hosted database instances and are seeing the collectors consistently oom-killed on Sundays at around midnight UTC. The issue occurs only on our larger instances (for example, EC2 instance types i4i.4xlarge and i4i.16xlarge), not on smaller ones, and we don't see the oom-kill at any other time during the week.

We are making a change to increase the memory limit on the systemd service, but wanted to raise this issue as well, since it seems unusual given the consistent timing.
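For anyone hitting the same thing, a minimal sketch of how such a limit can be raised via a systemd drop-in, assuming the unit name pganalyze-collector.service seen in the dmesg output below (the 2G value is only an illustration, not the limit we actually chose; on cgroup v1 hosts the directive is MemoryLimit= rather than MemoryMax=):

# Open an override file for the collector's unit (unit name taken from dmesg below)
sudo systemctl edit pganalyze-collector.service

# Contents of the override, for example:
[Service]
MemoryMax=2G

# Apply the new limit
sudo systemctl restart pganalyze-collector.service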

Here are some logs from dmesg showing the timing and how much memory is being used by the collector when it is killed:

[Sun Feb 26 00:00:49 2023] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/system.slice/pganalyze-collector.service,task_memcg=/system.slice/pganalyze-collector.service,task=pganalyze-colle,pid=2621967,uid=115
[Sun Feb 26 00:00:49 2023] Memory cgroup out of memory: Killed process 2621967 (pganalyze-colle) total-vm:6779672kB, anon-rss:1037104kB, file-rss:1112kB, shmem-rss:0kB, UID:115 pgtables:2844kB oom_score_adj:0
...
[Sun Mar  5 00:00:47 2023] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/system.slice/pganalyze-collector.service,task_memcg=/system.slice/pganalyze-collector.service,task=pganalyze-colle,pid=1603081,uid=115
[Sun Mar  5 00:00:47 2023] Memory cgroup out of memory: Killed process 1603081 (pganalyze-colle) total-vm:6106612kB, anon-rss:1037936kB, file-rss:1204kB, shmem-rss:0kB, UID:115 pgtables:2884kB oom_score_adj:0
...
[Sun Mar 12 00:01:59 2023] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/system.slice/pganalyze-collector.service,task_memcg=/system.slice/pganalyze-collector.service,task=pganalyze-colle,pid=750683,uid=115
[Sun Mar 12 00:01:59 2023] Memory cgroup out of memory: Killed process 750683 (pganalyze-colle) total-vm:6038448kB, anon-rss:1038556kB, file-rss:0kB, shmem-rss:0kB, UID:115 pgtables:3044kB oom_score_adj:0
...
[Sun Mar 19 00:00:59 2023] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/system.slice/pganalyze-collector.service,task_memcg=/system.slice/pganalyze-collector.service,task=pganalyze-colle,pid=3846386,uid=115
[Sun Mar 19 00:00:59 2023] Memory cgroup out of memory: Killed process 3846386 (pganalyze-colle) total-vm:5897128kB, anon-rss:1037724kB, file-rss:632kB, shmem-rss:0kB, UID:115 pgtables:2932kB oom_score_adj:0

The pganalyze collector logs don't contain anything unusual around these times.
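For context, the kernel-side entries above can be gathered with something like the following (a sketch; the human-readable timestamps suggest dmesg -T, and MemoryMax/MemoryCurrent are the standard systemd properties for the unit's cgroup limit and current usage):

# Kernel OOM messages with human-readable timestamps
dmesg -T | grep -iE 'oom-kill|out of memory'

# Memory limit and current usage systemd reports for the collector's unit
systemctl show pganalyze-collector.service -p MemoryMax -p MemoryCurrent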

@lfittl (Member) commented Mar 24, 2023

@scottybrisbane Thanks for the report!

It might be easier to talk about this in detail in a support ticket - that way we can share more specific information about your installation. Could you open a ticket through the in-app "Get help" functionality, if you haven't done so yet? Thanks!

(happy to update this issue afterwards in case there is something worth sharing more generally)
