Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout when reading response headers from daemon process in 60 seconds #859

Open
shishir-22 opened this issue Sep 28, 2023 · 3 comments
Open

Comments

@shishir-22
Copy link

shishir-22 commented Sep 28, 2023

Hi,

I am using mod_wsgi-express with tornado application. I want to increase the API timeout limit to 120 seconds from 60 seconds.

Here is my mod_wsgi-express command to start the application :
mod_wsgi-express setup-server <code_path>/deploy.wsgi --server-root=/etc/mod_apache/AppName --user=ec2-user --document-root=<code_path> --port=13000 --modules-directory /usr/lib64/httpd/modules --compress-responses --limit-request-body 13107200 --log-level=INFO --request-timeout 120

I have set the request-timeout to 120 seconds.


Here is the content of my deploy.wsgi setting:
import sys
import site
import os

site.addsitedir('{instance_dir}/venv/lib/python3.7/site-packages')
sys.path.append('{instance_dir}/parent-folder1')
sys.path.append('{instance_dir}/parent-folder1/parent-folder2')
from main_service import app as application


Here is the content of my app.conf file present at /etc/httpd/conf.d/appname.conf:
ProxyPass /AppName http://localhost:13000/
ProxyPassReverse /AppName http://localhost:13000/
Timeout 120
ProxyTimeout 120

I have set the timeout and proxytime to 120 seconds.


Now when my request is taking more than 60 seconds, these timeouts are not working. My API from browser is getting timeout in 180 seconds(Load balancer timeout). When I checked the logs, I found out that the process is running for around 65 seconds but there is a line at 60 seconds "Timeout when reading response headers from daemon process 'localhost:13000': /var/www/AppName/<code_path>/client, referer: "


Logs:
[Thu Sep 28 07:20:04.320698 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:04,320 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 24
[Thu Sep 28 07:20:05.322052 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:05,321 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 25
[Thu Sep 28 07:20:06.323379 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:06,323 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 26
[Thu Sep 28 07:20:07.324701 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:07,324 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 27
[Thu Sep 28 07:20:08.326044 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:08,325 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 28
[Thu Sep 28 07:20:09.328141 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:09,327 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 29
[Thu Sep 28 07:20:10.329522 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:10,329 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 30
[Thu Sep 28 07:20:11.330826 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:11,330 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 31
[Thu Sep 28 07:20:12.332143 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:12,332 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 32
[Thu Sep 28 07:20:13.333466 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:13,333 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 33
[Thu Sep 28 07:20:14.334770 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:14,334 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 34
[Thu Sep 28 07:20:15.336084 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:15,335 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 35
[Thu Sep 28 07:20:16.336410 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:16,336 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 36
[Thu Sep 28 07:20:17.336955 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:17,336 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 37
[Thu Sep 28 07:20:18.338268 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:18,338 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 38
[Thu Sep 28 07:20:19.339716 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:19,339 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 39
[Thu Sep 28 07:20:20.341064 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:20,340 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 40
[Thu Sep 28 07:20:21.342391 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:21,342 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 41
[Thu Sep 28 07:20:22.343712 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:22,343 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 42
[Thu Sep 28 07:20:23.345038 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:23,344 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 43
[Thu Sep 28 07:20:24.346354 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:24,346 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 44
[Thu Sep 28 07:20:25.347675 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:25,347 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 45
[Thu Sep 28 07:20:26.348993 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:26,348 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 46
[Thu Sep 28 07:20:27.350320 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:27,350 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 47
[Thu Sep 28 07:20:28.351649 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:28,351 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 48
[Thu Sep 28 07:20:29.353005 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:29,352 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 49
[Thu Sep 28 07:20:30.354381 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:30,354 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 50
[Thu Sep 28 07:20:31.355718 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:31,355 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 51
[Thu Sep 28 07:20:32.357055 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:32,356 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 52
[Thu Sep 28 07:20:33.358380 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:33,358 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 53
[Thu Sep 28 07:20:34.359702 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:34,359 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 54
[Thu Sep 28 07:20:35.360956 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:35,360 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 55
[Thu Sep 28 07:20:36.362267 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:36,362 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 56
[Thu Sep 28 07:20:37.363588 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:37,363 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 57
[Thu Sep 28 07:20:38.364922 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:38,364 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 58
[Thu Sep 28 07:20:39.365429 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:39,365 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 59
[Thu Sep 28 07:20:40.366732 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:40,366 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 60
[Thu Sep 28 07:20:41.223165 2023] [wsgi:error] [pid 21411:tid 140613437949696] [client 127.0.0.1:37826] Timeout when reading response headers from daemon process 'localhost:13000': /var/www/AppName/<code_path>/client, referer:
[Thu Sep 28 07:20:41.368040 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:41,367 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 61
[Thu Sep 28 07:20:42.369356 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:42,369 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 62
[Thu Sep 28 07:20:43.370690 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:43,370 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 63
[Thu Sep 28 07:20:44.372007 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:44,371 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 64
[Thu Sep 28 07:20:45.373323 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:45,373 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - 65
[Thu Sep 28 07:20:46.396521 2023] [wsgi:error] [pid 20933:tid 140612837431040] [remote 127.0.0.1:37826] 2023-09-28 07:20:46,396 - Common.Helper.LogErrorHandling.Services.S3Logger_file.log - INFO - Started fetching Client details


I am not able to figure out that from where the process is picking 60 seconds timeout. Request from browser is active for 180 seconds as per load balancer timeout, process is running for around 65 seconds as shown in logs something is getting timeout at 60 seconds.

Can you please help me dubug the issue further. Let me know if more information is needed.

Regards
Shishir

@GrahamDumpleton
Copy link
Owner

FWIW, my recollection of Tornado apps are that they require an async server. Trying to use Tornado, although think it has a mode for doing so under WSGI, brings severe limitations as it doesn't run async and runs as single thread blocking server effectively from memory.

Anyway, the reason the request timeout isn't kicking in is explained in the docs.

If there is more than one thread, the request timeout is applied to the average running time for any requests, across all threads. This means that a request can run longer than the request timeout. This is done to reduce the possibility of interupting other running requests, and causing a user to see a failure. So where there is still capacity to handle more requests, restarting of the process will be delayed if possible.

In default setup daemon mode of mod_wsgi has 5 threads with mod_wsgi-express, so if there was only one long running request, could take up to 10 minutes.

Try with:

mod_wsgi-express start-server ... --processes=3 --threads=1

By forcing one thread per process and multiple process, then the request timeout will be 2 minutes.

@shishir-22
Copy link
Author

shishir-22 commented Sep 28, 2023

Hey Graham,

Thanks for your response

I tried with 3 processes and 1 thread but the issue remains same. My code is running for around 65-70 seconds but API is getting timeout in browser after 180 seconds.

Server URL : http://localhost:13000/
Server Root : /etc/mod_apache/AppName
Server Conf : /etc/mod_apache/AppName/httpd.conf
Error Log File : /etc/mod_apache/AppName/error_log (INFO)
Rewrite Rules : /etc/mod_apache/AppName/rewrite.conf
Environ Variables : /etc/mod_apache/AppName/envvars
Control Script : /etc/mod_apache/AppName/apachectl
Operating Mode : daemon
Request Capacity : 3 (3 processes * 1 threads)
Request Timeout : 120 (seconds)
Startup Timeout : 15 (seconds)
Queue Backlog : 100 (connections)
Queue Timeout : 45 (seconds)
Server Capacity : 20 (event/worker), 20 (prefork)
Server Backlog : 500 (connections)
Locale Setting : en_US.UTF-8
Starting server...

Since my code execution is done in 65-70 seconds, I am not able to understand why my request is running for whole 180 seconds. When my code is executing for less than 60 seconds, I am getting the proper response of my API.

Something is happening at 60 second which is causing my request to go unresponsive and eventually it is getting timeout by load balancer after 180 seconds.

There is only this line in application logs "Timeout when reading response headers from daemon process" at 60th second.

I found another line in HTTP logs(/var/log/httpd/error_log): "[proxy_http:error] [pid 11089] (70007)The timeout specified has expired: [client :25996] AH01102: error reading status line from remote server localhost:13000, referer: "

Is there any other timeout setting which can resolve the issue here?

@GrahamDumpleton
Copy link
Owner

Let me try and summarise a few things.

You are using:

So basically not using Tornado as async, but trying to live with its shortcoming when using it via WSGI. A mode of operation which isn't really recommended.

You are hosting this with mod_wsgi-express, which is an Apache httpd instance with mod_wsgi using daemon mode.

You are then using yet another Apache httpd instance as a proxy in front of mod_wsgi-express.

So a request is received by the front end Apache which proxies it to the backend Apache (run by mod_wsgi-express), which because it is running daemon mode, sees another proxy step occurring, albeit managed by mod_wsgi module and not using normal HTTP proxy.

Your Tornado app blocks and doesn't respond with any data at all within 60 seconds, which is because although you have set --request-timeout you haven't set either --socket-timeout or --response-socket-timeout resulting in "Timeout when reading response headers from daemon process" after default of 60 seconds.

In other words, request timeout never gets to come into play since your application is totally silent.

Part of the reason for --request-timeout is to deal with long running requests which may periodically respond with some data, but ultimately take longer than expected.

If you want to set the --request-timeout for very long values and your application is not going to respond with any interim data right up until it is finished, you also need to at least use --response-socket-timeout and set it to a higher value than --request-timeout. This is because if you don't the lower level socket timeout will cause the request to fail before the higher level request timeout triggers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants