Server rejecting ELB connections resulting in intermittent ELB 502 error #867

Open
shishir-22 opened this issue Nov 29, 2023 · 2 comments

@shishir-22

Hi,

We are using mod_wsgi to host our Flask application on EC2 servers.

Recently, we encountered an issue where some of our API requests were failing with a 502 error. Upon investigation, we discovered that the requests were not reaching the EC2 servers from the ELB. After further investigation with AWS support, we learned that the target closed the connection with a TCP RST or a TCP FIN while the load balancer had an outstanding request to the target.

To resolve this, they recommended setting the keep-alive timeout of the server to a higher value than the ELB timeout. Our ELB timeout is 120 seconds, and according to Apache documentation, it's advised not to set high values for KeepAliveTimeout as it might result in performance issues in highly loaded servers (source).

After reviewing this documentation, I understood that RequestReadTimeout header=20-40 can also cause connections to close between ELB and the server.

Here is our mod_wsgi configuration:

LoadModule wsgi_module "/usr/local/lib64/python3.7/site-packages/mod_wsgi/server/mod_wsgi-py37.cpython-37m-x86_64-linux-gnu.so"
WSGIPythonHome "/usr"
ServerSignature Off
ServerTokens Full
TimeOut 60
WSGIPassAuthorization On
Header unset Server
SecServerSignature "Application"
SecRequestBodyNoFilesLimit 1111111
SecRequestBodyInMemoryLimit 1111111

<IfModule mod_reqtimeout.c>
  RequestReadTimeout header=20-40,MinRate=500 body=20-40,MinRate=500
</IfModule>

Could you help clarify a few queries:

  1. Is there any timeout setting in mod_wsgi that can help in fixing this issue?
  2. What are your views on setting a high value for KeepAliveTimeout, and what impact could it have on the WSGI application?

Regards
Shishir

@GrahamDumpleton
Owner

If an individual HTTP request was still active and was dropped before the response was returned, KeepAliveTimeout is not relevant. That setting only governs whether a socket connection is kept open for a time between separate HTTP requests on the same connection. The keep-alive timeout therefore should not interrupt active HTTP requests; it should only trigger in the quiet period between HTTP requests on a socket connection.

So I think AWS is confusing things with the terminology they are using.

What is relevant is the lower level socket timeout (not HTTP keep alive timeout). This lower level socket timeout is what the Timeout directive in Apache is for. Currently you have Timeout set to 60, which is below the ELB timeout.

So set Timeout to be higher.
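
As a rough sketch of that advice, assuming the 120-second ELB timeout mentioned in the issue, the directive might look like the following. The value 130 is only an illustrative assumption; any value comfortably above the ELB timeout fits the same idea.

```apache
# Low-level socket timeout, in seconds. The original configuration used 60,
# which is below the 120-second ELB timeout described in the issue.
# 130 is an assumed example value, not a figure taken from the thread.
Timeout 130
```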

Either leave KeepAliveTimeout set to 0 so that it is disabled, if that makes sense, or set it to at most 5 seconds. High keep-alive timeouts are generally not a good idea, as they can cause capacity problems.
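
For illustration, the keep-alive side of this could be written roughly as below. This is a sketch under the assumption that persistent connections stay enabled with the 5-second cap; `KeepAlive Off` is the standard Apache directive for turning persistent connections off entirely.

```apache
# Keep persistent connections enabled, but only hold an idle connection
# open for a short period between requests.
KeepAlive On
KeepAliveTimeout 5

# Alternative (assumption: persistent connections are not needed at all):
# KeepAlive Off
```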

@shishir-22
Author

Thank you for your quick response and clarification. Right, it makes sense not to set KeepAliveTimeout above 120 seconds, since that would keep the connection between the ELB and the server open for a long period without any requests.

Let me try by increasing the Timeout value.
