New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
63k file descriptor used for 400 connetions - Too many open files #1037
Comments
I agree that that sounds like quite a lot of file descriptors for just 400 connections. Which PgBouncer version is this? And did you change the version when increasing the connection limits?
Was that command before increasing the file limit to 64000 for pgbouncer? If not, it seems like your increase didn't have the intended effect. |
What errors do you have in the postgres logs? Because if pgbouncer is opening the many files, than postgres should not be running out of files (because the limits are per process). Basically, can you share the exact errors from the logs? |
fs.file-max is what you want probably: https://www.tecmint.com/increase-set-open-file-limits-in-linux/ |
Strange, out of LimitNOFILE always looked like
not yep, it should be system-wise |
Then it's good. LimitNOFILE is indeed what you want to use when using systemd. I guess maybe you copied the wrong output of /proc/PID/limit in your first comment, because there it showed 16384 (I guess that was a postgres process not a pgbouncer process). |
also set
|
[Service] This one is the unit file of pgbouncer And I have another one for patroni with LimitNOFILE set to 16384 |
Is it possible that pgbouncer creates / duplicates clients connexions ?
With server_login_retry, I think pgbouncer must retry the query but maybe there is a problem there ? |
yeah so, it sounds like that should be set to a larger. Now that you're getting close to that number. It does seem like a fairly high limit already, but even without pgbouncer you already get ~50k. I'd say, set that to 655350 (ten times the current value). And then run your intended setup again. Then you can find out what is using your file descriptiors. But it sounds like it's mostly postgres. |
I understand what you mean about increase this limit but I still don't understand why pgbouncer used so much file descriptor for 400 connections like my supp shows when I have disabled pgbouncer, file descriptors downgrade from 64k to 3k and now 16k since one hour. The global problem seems related to pgbouncer -> uses too much fd and reaches the fs.file-max limit so postgres doesnt have enougth fd and log too many open files but I dont think it's related to postgres. and I prefer to check out this problem instead of upgrade the fs.file-max value because if pgbouncer has a fd leak or something, it will just consume RAM and CPU for nothing ? |
Yeah, I also don't understand why the number is so high. But if PgBouncer is not running, it's pretty much impossible to debug. So that's why I'm suggesting, configure the settings so that the system at least runs. And then figure out what is causing so many file descriptors to be used using a combination of Also what was your pgbouncer config before changing it? Because your file descriptor pattern even after disabling pgbouncer seems very different from the (pretty much) flat line it was before. |
Ok I will reenable the ssetup this nigth, could you give me some point to check / debug. |
pgbouncer in verbose mode ? |
I think mostly using some os tools ( |
ok I wll try it this week and give you a feedback. Thanks for your help. See you |
Hello again :) I still have an high consumption of file descriptor when there is a spike of client connections...
also somes stats:
2688 queries/s is high ? at 17:36. It's higher than others pgbouncer log stats but how it causes 70k file descriptor ... All crash with nouser ?
ok again
Do you have some ways to resolve this or other things you need to debug ? Thanks |
Can you run this command to see the number of open files per process?
|
oh good shot, nice command btw ;)
Add a lot of file descriptor like
cat /tmp/test | grep "/var/lib/postgresql/12/main/base/" | wc -l So it's the process postgres that opens a lot of filedescriptor but the spikes happens only if pgbouncer is started. When I have disabled pgbouncer, I have no spike (maybe because max_db_connections was set to 500 so lot of connection get dropped ?) this is my postgresql conf:
Could you give me some ways to debug this ? I m sure you know better postgres than me :). Thank you again ! |
Okay, so yeah those ~400 file descriptors for pgbouncer itself seem quite normal. Why postgres has so many open is probably because you have set:
So each process can open 4096 files, multiplied by the 120 default_pool_size that could run up to 125*4096: 512000 open files in the worst case. The most probable reason why pgbouncer makes this problem more pronounced is that it will keep connections to postgres open, thus a postgres process that opened a lot of files will keep them open much longer than if your client does:
Because now the disconnect does not happen, and the disconnect would close all filedescriptors automatically without pgbouncer in the picture. So I don't really think there's a huge problem going on here, just increase You could also lower One question though is: Why is the spike so high right after changing to pgbouncer, and why does it go back down after a while. I think this might have to do with you restarting postgres too at that point. And thus If that's indeed the case you probably want to enable pg_prewarm. That way a single background worker would fill the cache, instead of having the cache be filled by all backends. You can enable pg_prewarm by adding this to your postgresql.conf:
Source: https://www.postgresql.org/docs/current/pgprewarm.html |
Oh im very grateful ! To be sure, when a client connect to pgbouncer -> create/reuse a pool connection that connect to postgres -> create a postgres process and this process can create / open 4096 files. Ok but in my debug, I found that postgres connection only need one or two files descriptor ? I missunderstood ? "One question though is: Why is the spike so high right after changing to pgbouncer, and why does it go back down after a while" Ok I wll increase gradually the fs.file-max and monitor the behavior. |
Yeah, so I guess Postgres doesn't always close the files it opens in a process. And by pgbouncer caching the connections to postgres, these files stay open longer, and then the next query might open more files, which again stay open. Thus eventually adding up to a lot of files per postgres process. |
Should I let the issue opened and give a feedback next week ? |
Having not received a reply, closed. |
Hi, Sorry for the feedback, I have upgrade the fs.file-max to 400k with pg_prewarm and downgrade a little pool_connection to 100 per database. I Have some spike to 150k fd open but it doesnt impact RAM or CPU iin my supp and no more too many open files ERROR so quite good for me. Thanks again. |
Hi,
Im facing a difficult issue with pgbouncer and postgres in production. I administrate a BDD cluster (haproxy + patroni + pgbouncer + postgres) for 4 years (I'm not a begginer) and now I need to upgrade default connexions settings because we have more clients.
I have three baremetal server (32GB RAM + 12VCPU). On each one, I have haproxy + pgbouncer + patroni + postgres.
My problem is simple but I dont have any explanation:
When I have a spike connections (around 400 which is not incredible ^^), my supp show me 63k file descriptor open on the server. I can't split this metric by process but I have ERROR in pgbouncer and postgres logs like
Too many open files
with server_login_retry for pgbouncerFirst of all I have increase the open file limits for pgbouncer (64000) and postgres process (16384) but it still reaches this limit ...
I could increase again but I want to understand why pgbouncer are uses so much file descriptors for 400 connexions ?
I read that a client connexion open one file descriptor so pgbouncer should have needed 400 fd ?
pgboucer_config:
max open files pgbouncer:
max open files postgres:
I set up max connexion for postgres to 500.
For now, I deactivate pgbouncer so my clients connect directy to the postgres database, not recommended in production but I have no choice ...
I didn't find any issues like this for pgbouncer ...
Thanks for you help
The text was updated successfully, but these errors were encountered: