Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with connection timeouts and multihreadding #247

Open
agcmoser opened this issue Apr 29, 2022 · 5 comments
Open

Problems with connection timeouts and multihreadding #247

agcmoser opened this issue Apr 29, 2022 · 5 comments

Comments

@agcmoser
Copy link

Hello,

I have a code where I write data into a Postgres db (using LibPQ.jl) which runs in parallel task. I had general issues with multithreading but hose could be solved, see https://discourse.julialang.org/t/no-real-multithreading-possible/77920.

Now I can observe that if there is a connection timeout to my Postgres-db make a libpq call taking longer, the thread calling libpq library is blocking all other threads. I am wondering if multithreading is only working correctly with pure julia code or if this is an issue with LibPQ

Any hint is welcome.
TIA, agcm

@iamed2
Copy link
Owner

iamed2 commented Apr 29, 2022

There are a number of things that can cause blocking; a traceback from an interrupt when the blocking is happening would be useful to narrow it down. I'm also not sure how I would tell the difference between a blocking C library call and monopolizing a thread (which sounds like it can happen from the linked discussions?), without instrumenting LibPQ (which you could try on your machine).

One thing is if you're using a hostname, then DNS lookups can block: https://www.postgresql.org/docs/10/libpq-connect.html#LIBPQ-PARAMKEYWORDS

I'm afraid I haven't kept up with multithreading in Julia so I'm not sure if there's a difference between running on the "main thread" or another thread, and I don't know whether async code can monopolize a thread or something. I can say that, as is, LibPQ.jl connections should yield as appropriate while connecting as long as the restrictions are met and the standard library's poll_fd yields properly.

@agcmoser
Copy link
Author

agcmoser commented May 2, 2022

@iamed2: Thanks for the fast reply and especially for the hint with the DNS lookup because: we actually use two different Postgres Servers: one, which is accessible via the internet using and DNS and one accessible via a VPN extranet and its VPN IP. The one access via VPN and its IP so fare never had the described problem. The DNS one does. The two servers are not the same one and hosted by different provides (most probably also not the same version of Postgres server application) so we first though it be rather an issue of the different servers and not the DNS lookup. I will try accessing the DNS based server using its IP to verify that it is actually that ... but it seems like so.

@iamed2
Copy link
Owner

iamed2 commented May 2, 2022

To deal with this you can perform the DNS lookup in advance and cache it using the stdlib:

julia> using Sockets  # stdlib

julia> getalladdrinfo("google.com")  # all IPs, useful to handle failover
2-element Vector{IPAddr}:
 ip"142.250.191.110"
 ip"2607:f8b0:4009:817::200e"

julia> getaddrinfo("google.com")  # first IP
ip"142.250.191.110"

Then pass that as hostaddr instead of host in the DSN.

@agcmoser
Copy link
Author

agcmoser commented May 3, 2022

@iamed2: Good idea. I will incorporate that in my code and give feedback once it's active again and runs without errors (or if the error still exists)

@agcmoser
Copy link
Author

agcmoser commented May 5, 2022

I think I still have the same issue, also when using the hostaddr instead of the host. I rather think it's a timeout problem. In the documentation here I saw that the default value of the connection parameter connect_timeout means wait indefinitely ... I have now set connect_timeout=10 in the LibPQ connection string and will evaluate if things are better now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants