ERR_HTTP2_SERVER_REFUSED_STREAM with PHP web application on 0.9.0 #1642

alexharrington opened this issue May 27, 2021 · 26 comments

@alexharrington

alexharrington commented May 27, 2021

I'm running Xibo (which is PHP/Apache) behind nginx-proxy with the LetsEncrypt helper.

Normally I have no issues, but today I did a new install and on about 30% of my page loads (mainly for JavaScript files), Chrome 90 gives an ERR_HTTP2_SERVER_REFUSED_STREAM error loading those resources. Refreshing the page produces the error on a different set of those URLs. If I access the URLs individually they work as expected.

Dropping down to 0.8.0 fixes it, and going back to 0.9.0 causes it again, so I can only presume something changed in nginx?

To replicate, you'd need a docker-compose setup like this:

version: "2.1"

services:
    cms-db:
        image: mysql:5.7
        volumes:
            - "./shared/db:/var/lib/mysql:Z"
        environment:
            - MYSQL_DATABASE=cms
            - MYSQL_USER=cms
            - MYSQL_RANDOM_ROOT_PASSWORD=yes
            - MYSQL_PASSWORD=password123
    cms-web:
        image: xibosignage/xibo-cms:release-2.3.10
        volumes:
            - "./shared/cms/custom:/var/www/cms/custom:Z"
            - "./shared/backup:/var/www/backup:Z"
            - "./shared/cms/web/theme/custom:/var/www/cms/web/theme/custom:Z"
            - "./shared/cms/library:/var/www/cms/library:Z"
            - "./shared/cms/web/userscripts:/var/www/cms/web/userscripts:Z"
            - "./shared/cms/ca-certs:/var/www/cms/ca-certs:Z"
        links:
            - cms-db:mysql
        environment:
            - XMR_HOST=cms-xmr
            - VIRTUAL_HOST=cms.example.org
            - LETSENCRYPT_HOST=cms.example.org
            - LETSENCRYPT_EMAIL=user@example.org
            - HTTPS_METHOD=noredirect
            - SERVER_TOKENS=off
            - MYSQL_PASSWORD=password123
    cms-proxy:
        image: jwilder/nginx-proxy:0.9.0
        ports:
            - 80:80
            - 443:443
        volumes:
            - ./shared/proxy/certs:/etc/nginx/certs:ro
            - /var/run/docker.sock:/tmp/docker.sock:ro
            - /usr/share/nginx/html
        labels:
            - com.github.jrcs.letsencrypt_nginx_proxy_companion.nginx_proxy
        environment:
            - "DEFAULT_HOST=cms.example.org"
            - "DHPARAM_GENERATION=false"
    cms-letsencrypt:
        image: jrcs/letsencrypt-nginx-proxy-companion
        volumes_from:
            - cms-proxy
        volumes:
            - ./shared/proxy/certs:/etc/nginx/certs:rw
            - /var/run/docker.sock:/var/run/docker.sock:ro

Once the containers are started, log in at https://cms.example.org (username xibo_admin, password password) and check the Network tab in Chrome's developer tools.

@buchdag
Member

buchdag commented May 27, 2021

@alexharrington could you try with nginxproxy/nginx-proxy:1642 (same as main branch but with #1641 merged in)?

@alexharrington
Author

I get the same issue there I'm afraid:
(screenshot: Chrome devtools Network tab showing the same ERR_HTTP2_SERVER_REFUSED_STREAM errors)

@buchdag
Member

buchdag commented Jun 8, 2021

@alexharrington #1644 will be merged by the end of the week and will allow disabling HTTP/2 entirely.

@alexharrington
Author

Thanks, although HTTP/2 works fine on 0.8.0.

@buchdag
Member

buchdag commented Jun 8, 2021

Could you try mounting a file at /etc/nginx/conf.d/custom_proxy_settings.conf containing the following line:

http2_max_concurrent_streams 256;

The exact name of the file isn't important; it just has to end with .conf and be mounted inside /etc/nginx/conf.d.
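
For example, with the compose file from the original report, that would just mean one extra entry in the cms-proxy volumes, along these lines (an illustrative sketch; the host-side path is arbitrary):

    cms-proxy:
        image: jwilder/nginx-proxy:0.9.0
        volumes:
            - ./shared/proxy/custom_proxy_settings.conf:/etc/nginx/conf.d/custom_proxy_settings.conf:ro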

@alexharrington
Author

That doesn't seem to make any difference as far as I can see.

I have a test VPS I've just set up to try your suggestion above. If you'd like temporary access to it, I'd be happy to provide that.

@buchdag
Member

buchdag commented Jun 9, 2021

I'm skimming through the diff between 0.8.0 and 0.9.0; the only potentially impactful changes were the upgrade from nginx 1.19.3 to 1.19.10 and the addition of sed -i 's/worker_connections 1024/worker_connections 10240/' /etc/nginx/nginx.conf in the Dockerfile.

Also, I noticed this in the docs:

nginxproxy/nginx-proxy:alpine

This image is based on the nginx:alpine image. Use this image to fully support HTTP/2 (including ALPN required by recent Chrome versions).

I have no idea why this was added at the time or whether it is still relevant, but it might be worth a shot to try nginxproxy/nginx-proxy:0.9.0-alpine. If that still doesn't work I'll push other images built with nginx versions prior to 1.19.10, if you're okay with testing them.

@alexharrington
Author

alexharrington commented Jun 9, 2021

Same issue with nginxproxy/nginx-proxy:0.9.0-alpine.

Can I try lowering the worker connections setting somehow to test that? I tried directly editing /etc/nginx/nginx.conf inside the proxy container, and then restarting, but I still have the same issue.
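
For reference, this is roughly what I tried (the proxy container is named cms-proxy, matching the compose file above):

docker exec cms-proxy sed -i 's/worker_connections 10240/worker_connections 1024/' /etc/nginx/nginx.conf
docker restart cms-proxy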

Very happy to test anything you wish. Thanks so much for your help with this.

@buchdag
Member

buchdag commented Jun 9, 2021

I've just pushed nginxproxy/nginx-proxy:1642 again.

It's identical to the main branch except for a downgrade to nginx 1.19.3.

@alexharrington
Author

Thank you. nginxproxy/nginx-proxy:1642 seems to work OK.

@buchdag
Member

buchdag commented Jun 9, 2021

I've pushed the following images:

nginxproxy/nginx-proxy:1642-1194 : nginx 1.19.4
nginxproxy/nginx-proxy:1642-1195 : nginx 1.19.5
nginxproxy/nginx-proxy:1642-1196 : nginx 1.19.6
nginxproxy/nginx-proxy:1642-1197 : nginx 1.19.7
nginxproxy/nginx-proxy:1642-1198 : nginx 1.19.8
nginxproxy/nginx-proxy:1642-1199 : nginx 1.19.9

Could you test them and tell me with which version of nginx the issue starts to appear?

@alexharrington
Author

alexharrington commented Jun 9, 2021

nginxproxy/nginx-proxy:1642-1194 : nginx 1.19.4 - works OK
nginxproxy/nginx-proxy:1642-1195 : nginx 1.19.5 - works OK
nginxproxy/nginx-proxy:1642-1196 : nginx 1.19.6 - works OK

nginxproxy/nginx-proxy:1642-1197 : nginx 1.19.7 - several net::ERR_CONNECTION_RESET and net::ERR_CONNECTION_CLOSED errors

nginxproxy/nginx-proxy:1642-1198 : nginx 1.19.8 - several net::ERR_CONNECTION_RESET and net::ERR_CONNECTION_CLOSED errors

nginxproxy/nginx-proxy:1642-1199 : nginx 1.19.9 - same problem as originally reported above: net::ERR_HTTP2_SERVER_REFUSED_STREAM

@buchdag
Member

buchdag commented Jun 10, 2021

Changes with nginx 1.19.7 16 Feb 2021

*) Change: connections handling in HTTP/2 has been changed to better
   match HTTP/1.x; the "http2_recv_timeout", "http2_idle_timeout", and
   "http2_max_requests" directives have been removed, the
   "keepalive_timeout" and "keepalive_requests" directives should be
   used instead.

*) Change: the "http2_max_field_size" and "http2_max_header_size"
   directives have been removed, the "large_client_header_buffers"
   directive should be used instead.

*) Feature: now, if free worker connections are exhausted, nginx starts
   closing not only keepalive connections, but also connections in
   lingering close.

*) Bugfix: "zero size buf in output" alerts might appear in logs if an
   upstream server returned an incorrect response during unbuffered
   proxying; the bug had appeared in 1.19.1.

*) Bugfix: HEAD requests were handled incorrectly if the "return"
   directive was used with the "image_filter" or "xslt_stylesheet"
   directives.

*) Bugfix: in the "add_trailer" directive.

Changes with nginx 1.19.8 09 Mar 2021

*) Feature: flags in the "proxy_cookie_flags" directive can now contain
   variables.

*) Feature: the "proxy_protocol" parameter of the "listen" directive,
   the "proxy_protocol" and "set_real_ip_from" directives in mail proxy.

*) Bugfix: HTTP/2 connections were immediately closed when using
   "keepalive_timeout 0"; the bug had appeared in 1.19.7.

*) Bugfix: some errors were logged as unknown if nginx was built with
   glibc 2.32.

*) Bugfix: in the eventport method.

Changes with nginx 1.19.9 30 Mar 2021

*) Bugfix: nginx could not be built with the mail proxy module, but
   without the ngx_mail_ssl_module; the bug had appeared in 1.19.8.

*) Bugfix: "upstream sent response body larger than indicated content
   length" errors might occur when working with gRPC backends; the bug
   had appeared in 1.19.1.

*) Bugfix: nginx might not close a connection till keepalive timeout
   expiration if the connection was closed by the client while
   discarding the request body.

*) Bugfix: nginx might not detect that a connection was already closed
   by the client when waiting for auth_delay or limit_req delay, or when
   working with backends.

*) Bugfix: in the eventport method.

Changes with nginx 1.19.10 13 Apr 2021

*) Change: the default value of the "keepalive_requests" directive was
   changed to 1000.

*) Feature: the "keepalive_time" directive.

*) Feature: the $connection_time variable.

*) Workaround: "gzip filter failed to use preallocated memory" alerts
   appeared in logs when using zlib-ng.

Relevant bits:

In 1.19.8

Bugfix: HTTP/2 connections were immediately closed when using "keepalive_timeout 0"; the bug had appeared in 1.19.7.

In 1.19.9

Bugfix: nginx might not close a connection till keepalive timeout expiration if the connection was closed by the client while discarding the request body.

Bugfix: nginx might not detect that a connection was already closed by the client when waiting for auth_delay or limit_req delay, or when working with backends.

And maybe this in 1.19.10

Change: the default value of the "keepalive_requests" directive was changed to 1000.

Feature: the "keepalive_time" directive.

Feature: the $connection_time variable.

@buchdag
Member

buchdag commented Jun 10, 2021

Additional info:

We're not setting keepalive_timeout in the template so it has the default value of 75s.

We're not setting auth_delay, limit_req, keepalive_requests or keepalive_time either.

We were not using any http2_* directives.
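
For reference, that means nginx was effectively running with its built-in defaults for these (per the nginx docs; the keepalive_requests default changed from 100 to 1000 in 1.19.10):

keepalive_timeout 75s;
keepalive_requests 1000;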

@buchdag
Member

buchdag commented Jun 10, 2021

https://trac.nginx.org/nginx/ticket/2155

It turns out no browsers implement HTTP/2 GOAWAY handling properly, and
large enough number of resources on a page results in failures to load
some resources. In particular, Chrome seems to experience errors if
loading of all resources requires more than 1 connection (while it
is usually able to retry requests at least once, even with 2 connections
there are occasional failures for some reason), Safari if loading requires
more than 3 connections, and Firefox if loading requires more than 10
connections (can be configured with network.http.request.max-attempts,
defaults to 10).

It does not seem to be possible to resolve this on nginx side, even strict
limiting of maximum concurrency does not help, and loading issues seems to
be triggered by merely queueing of a request for a particular connection.
The only available mitigation seems to use higher keepalive_requests value.

Final closing message:

Unfortunately, it is not possible to completely resolve this on nginx side. Proper solution would be to fix GOAWAY handling in browsers.
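
So the only mitigation available on the nginx side is a higher keepalive_requests value, which with nginx-proxy could go into the same kind of .conf drop-in mounted under /etc/nginx/conf.d as suggested above, e.g. something like (a sketch; the exact number is arbitrary):

keepalive_requests 10000;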

@alexharrington
Author

If it's useful, Chrome shows 44 requests made to load the page that I've been testing with. Certainly nowhere near the thousands of resources they're talking about in that issue.

It sounds like it's browser-side though, so I think we'll disable HTTP/2 when that option is available to us, and then look to re-enable it down the line when Chrome hopefully gets fixed.

Thanks for your help on this.

@buchdag
Member

buchdag commented Jun 10, 2021

I figured from your screenshot that you're far from the ~1000 requests mentioned in the nginx issue, but it seems this can be triggered even with a low request count in some corner cases:

loading issues seems to be triggered by merely queueing of a request for a particular connection

Have you tried with other browsers?

@alexharrington
Author

alexharrington commented Jun 11, 2021

I hadn't, no, but I have now.

Firefox 89 seems to have a similar issue. The page loads, but some of the CSS/JS is missing with 0.9.0. The page takes a long time to load compared to normal.

I can't see a response code, but I can see that the browser spent some time in the "sending" phase but never got a response.

(screenshots: Firefox network panel showing the affected requests stuck in the "sending" phase with no response)

@AndreasDeCrinis

I can reproduce the exact same behavior with the nginx ingress controller using the bitnami helm charts https://github.com/bitnami/charts/tree/master/bitnami/nginx-ingress-controller

A workaround that works for me is to set the keep-alive value to "1" instead of "0".

@buchdag
Member

buchdag commented Jun 24, 2021

@AndreasDeCrinis which nginx keep-alive variable specifically?

@alexharrington
Author

Adding keepalive_timeout 1; to vhost.d/default does seem to resolve it for me too.
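
For anyone following along, this is roughly what that looks like with the compose setup from my original report (a sketch; nginx-proxy includes vhost.d/default in the server block of any virtual host that doesn't have its own per-host file):

    # docker-compose.yml, added to the cms-proxy service:
    cms-proxy:
        volumes:
            - ./shared/proxy/vhost.d:/etc/nginx/vhost.d:ro

    # ./shared/proxy/vhost.d/default:
    keepalive_timeout 1;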

@buchdag
Member

buchdag commented Jul 23, 2021

What would be the possible implications of adding keepalive_timeout 1 to the template for those who aren't affected by this issue?

@Fabiencdp

We recently noticed the same problem; for now it happens in Chromium-based browsers.
The keepalive_timeout 1 setting works well, but we had to disable it because of a requirement from an SSO auth provider.

Is there any news about this issue? I tried the latest docker image and I also get the problem when keepalive_timeout is 0.

@Fabiencdp

Using the 1642 image seems to fix the problem, even with keepalive_timeout 0.

@spacecat

spacecat commented Jul 29, 2023

I solved my ERR_HTTP2_SERVER_REFUSED_STREAM by setting a higher value for keep-alive-requests in my ingress-nginx:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-mse-controller
  namespace: mse
data:
  enable-brotli: "true"
  use-http2: "true"
  keep-alive-requests: "9999"

The default value for keep-alive-requests is 1000.

http://nginx.org/en/docs/http/ngx_http_core_module.html#keepalive_requests

There is no "unlimited" value; just set a sufficiently high one. The higher the value, the more memory your server will consume.

More info here: https://serverfault.com/a/425130/241371 and here: https://trac.nginx.org/nginx/ticket/2155
