
port mapping disappear after 'intensive' use #8817

Open
superbob opened this issue Oct 28, 2014 · 16 comments

Labels
area/networking exp/expert kind/bug version/1.3

Comments

@superbob

I have a rabbitmq-server docker container created from a self-made image (Dockerfile) that exposes 3 ports: 4369, 5672, 15672.
I want to use it from a client application that runs a lot of tests (opening and closing many connections in a short time span).
When I use it through my client application, it works correctly at first, but after a short time (20-30 s) my client application starts receiving only "Connection Refused" errors and one of the port mappings disappears.
The initial docker run command I used was:

docker run -i -p 5672:5672 -p 15672:15672 -p 4369:4369 --name="rabbitmq-server" -t rabbitmq-server

I created the container some months ago and it is not running continuously,
so I start it every day with a docker start rabbitmq-server command.
After starting it, I see the ports mapped on my host interface with netstat:

$ sudo netstat -nlp | grep docker-proxy                                                                                                                                         [10:55:52]
tcp        0      0 :::15672                :::*                    LISTEN      26875/docker-proxy  
tcp        0      0 :::5672                 :::*                    LISTEN      26892/docker-proxy  
tcp        0      0 :::4369                 :::*                    LISTEN      26884/docker-proxy  

(I filtered the output to show only the interesting ports.)
After the "Connection Refused" errors start, the 5672 port mapping is missing:

$ sudo netstat -nlp | grep docker-proxy                                                                                                                                         [10:59:02]
tcp        0      0 :::15672                :::*                    LISTEN      26875/docker-proxy  
tcp        0      0 :::4369                 :::*                    LISTEN      26884/docker-proxy  

(I filtered the output to show only the interesting ports.)
Despite this, the container works fine: the other ports work correctly and the web interface (port 15672) is reachable.
Connecting directly to the container IP also works:

$ telnet 172.17.0.5 5672                                                                                                                                                        [11:05:21]
Trying 172.17.0.5...
Connected to 172.17.0.5.
Escape character is '^]'.

It is only the port mapping that disappears.

Here's a capture showing the throughput from RabbitMQ:
[screenshot: RabbitMQ message throughput graph]
It started consuming at 10:58:35 and there was a breakdown at 10:58:55, which corresponds to the moment I started receiving "Connection Refused" errors. At its best it was consuming more than 100 msg/s.

It might be related to: #8022 and/or #8428

Host info:

$ docker info                                                                                                                                                                   [11:10:46]
Containers: 4
Images: 74
Storage Driver: devicemapper
 Pool Name: docker-8:2-10226587-pool
 Pool Blocksize: 65.54 kB
 Data file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata file: /var/lib/docker/devicemapper/devicemapper/metadata
 Data Space Used: 8.267 GB
 Data Space Total: 107.4 GB
 Metadata Space Used: 7.504 MB
 Metadata Space Total: 2.147 GB
 Library Version: 1.03.01 (2011-10-15)
Execution Driver: native-0.2
Kernel Version: 3.11.10-21-default
Operating System: openSUSE 13.1 (Bottle) (x86_64) (containerized)

$ docker version                                                                                                                                                                [11:07:31]
Client version: 1.3.0
Client API version: 1.15
Go version (client): go1.3.1
Git commit (client): c78088f
OS/Arch (client): linux/amd64
Server version: 1.3.0
Server API version: 1.15
Go version (server): go1.3.1
Git commit (server): c78088f
@LK4D4
Contributor

LK4D4 commented Oct 28, 2014

Seems like docker-proxy is just dying. We probably fixed this in master.

@superbob
Author

superbob commented Nov 4, 2014

Same issue in 1.3.1

$ docker version
Client version: 1.3.1
Client API version: 1.15
Go version (client): go1.3.3
Git commit (client): 4e9bbfa
OS/Arch (client): linux/amd64
Server version: 1.3.1
Server API version: 1.15
Go version (server): go1.3.3
Git commit (server): 4e9bbfa

@superbob
Author

superbob commented Dec 5, 2014

I changed the way my client application uses the rabbitmq-server docker container so that it is less "intensive".

Now I can no longer reproduce the problem, but I can't tell whether the issue is still there.

I don't know if this issue should be closed.

@gdm85
Contributor

gdm85 commented Jan 2, 2015

I have an issue that can be modeled around this problem too; I'll try to write a test for the situation you described.

@superbob
Author

superbob commented Jan 5, 2015

Thank you @gdm85

@jessfraz
Contributor

Can you check if it is fixed for 1.5?

@leverly

leverly commented Feb 28, 2015

It is still not fixed for 1.5.

@thaJeztah
Member

@cpuguy83 probably needs label "bug" too?

@spf13 added kind/bug /system/networking exp/expert and removed /system/networking exp/expert labels Mar 21, 2015
@stpkys

stpkys commented Oct 6, 2015

Still a problem: docker-proxy dies silently under a high number of concurrent connections.

Client version: 1.7.1
Client API version: 1.19
Go version (client): go1.4.2
Git commit (client): 786b29d/1.7.1
OS/Arch (client): linux/amd64
Server version: 1.7.1
Server API version: 1.19
Go version (server): go1.4.2
Git commit (server): 786b29d/1.7.1
OS/Arch (server): linux/amd64

Increasing the open files limit helped (ulimit -n 65535), but it would be great if docker-proxy logged this somehow.
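
For reference, one way to make that limit stick on a systemd host is a drop-in for the daemon unit (just a sketch, assuming the service is named docker.service; adjust the value to taste):

# Raise the open-file limit for the Docker daemon and everything it spawns,
# which should include the docker-proxy processes.
sudo mkdir -p /etc/systemd/system/docker.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/docker.service.d/limits.conf
[Service]
LimitNOFILE=65535
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker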

@BlackGlory

It can still be reproduced on 19.03.11 and 19.03.12.

docker run -d -p 8080:80 --name my-nginx nginx:1.18

netstat -ntulp | grep 8080
# tcp6    0     0    :::8080     :::*     LISTEN  6237/docker-proxy

# at the beginning of this benchmark, my-nginx can still be accessed through localhost:8080
wrk -t12 -c1000 -d30s http://localhost:8080

netstat -ntulp | grep 8080
# empty

wrk -t12 -c1000 -d30s http://localhost:8080
# unable to connect to localhost:8080 Connection refused

# get my-nginx's ip
docker inspect my-nginx
# my-nginx can be accessed through container_ip:80
wrk -t12 -c1000 -d30s http://172.17.0.2:80

The Stack Overflow question:
https://stackoverflow.com/questions/64014595/the-docker-container-loses-port-forwarding-after-running-benchmarks

@thaJeztah
Member

thaJeztah commented Sep 23, 2020

Thanks for that additional information, @BlackGlory. From that output, it seems like the docker-proxy process for the container is gone.

I wonder if the system was under memory pressure and, because of that, the kernel's OOM killer kicked in and killed the docker-proxy process for that container.

Looking at that process;

docker run -d --name foo -p 8070:80 nginx:alpine

# (I only have a single container running on this test machine)
pidof docker-proxy
4607

cat /proc/4607/oom_adj
-8

I see that docker-proxy has a default oom-score-adj of -8. Although that is slightly adjusted to make it less likely to be killed (the default is 0), and it is a lower score than the container itself (which doesn't adjust its score by default), it's still possible that, if the system is under memory pressure, the kernel OOM killer killed the proxy.
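
For comparison, a quick way to look at the container's own score (a rough check, assuming the container from above is still running as foo):

# PID 1 of the container, as seen from the host; oom_score_adj should print 0
# unless --oom-score-adj was passed at run time.
cat /proc/$(docker inspect --format '{{.State.Pid}}' foo)/oom_score_adj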

I haven't checked yet where the OOM-score-adj for docker-proxy is set, but I did check whether the -8 score is adjusted if the container itself is configured to have a lower oom-score-adj;

# remove the container
docker rm -f foo

# create a new container with a negative OOM-score-adjust
docker run -d --name foo -p 8070:80 --oom-score-adj=-200 nginx:alpine


pidof docker-proxy
5122

cat /proc/5122/oom_adj
-8

So from that, this doesn't appear to be the case. Perhaps we should adjust the OOM-score-adj to be relative to the container's score, so that (more likely) either both are killed (container including the proxy) or both are kept up; otherwise the container keeps running in a somewhat defunct state (ports not accessible).
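
As a stopgap, one could also push the running proxies further away from the OOM killer by hand (a rough sketch; the -500 value is arbitrary and it does not persist for proxies started later):

# Lower the OOM score of every running docker-proxy; valid range is -1000..1000.
for pid in $(pidof docker-proxy); do
    echo -500 | sudo tee /proc/$pid/oom_score_adj
done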

Note that I think the docker-proxy is only needed to facilitate hairpin connections from the host itself. If you're on a modern distro, you may be able to run without it and configure the docker daemon not to spin up docker-proxy processes for each container.
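
For example (a sketch, not a recommendation: it changes how published ports behave for hairpin/localhost traffic, and it assumes a systemd-managed daemon and no existing /etc/docker/daemon.json to merge with):

# Disable the userland proxy so ports are published via iptables/NAT only.
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "userland-proxy": false
}
EOF
sudo systemctl restart docker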

@thaJeztah
Member

Also see #14856 w.r.t. the docker-proxy, and #5618, which is a kernel bug (but should be fixed in recent kernels) that prevented us from disabling it by default.

@BlackGlory if you're consistently able to reproduce the issue on your test-system, would you be able to check if the process was killed by the kernel's OOM killer? You should be able to find log-entries for this in your system log; https://stackoverflow.com/a/15953500/1811501

dmesg | egrep -i 'killed process'

or

grep -i 'killed process' /var/log/messages

@BlackGlory

BlackGlory commented Sep 23, 2020

@thaJeztah It doesn't seem to be about the kernel's OOM killer.

dmesg | egrep -i 'killed process'
# empty
grep -i 'killed process' /var/log/syslog
# empty

@cjdcordeiro

cjdcordeiro commented Jan 8, 2021

+1 - also having this issue on an RPi 3B+, with Docker 18.09.1.

Also, FYI, in case someone wants to work around this, and given that the docker-proxy process is restarted alongside a container restart:

  1. define a restart policy for the container (e.g. on-failure)
  2. install and add tini as the entrypoint of the container (e.g. ENTRYPOINT ["/sbin/tini", "--"])
  3. set a healthcheck. Assuming you're running some API server on host port XYZ, your HEALTHCHECK command would look something like curl -f http://$(route -n | grep 'UG[ \t]' | awk '{print $2}'):XYZ 2>&1 || (kill $(pgrep tini) && exit 1). The $(route ...) part infers the gateway IP, so the curl request goes directly to the host, where the container's service should be published on port XYZ. If docker-proxy is down, the curl gets "Connection refused", the healthcheck command kills the container's main process, and the container restarts (because of 1.), which spawns a new docker-proxy process. A sketch of this setup as docker run flags follows the list.
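
Here is that workaround condensed into docker run flags (just a sketch: my-api-image, the service on port 8080, and the presence of tini, curl and pgrep inside the image are all assumptions; 172.17.0.1, the default bridge gateway, stands in for the route/awk lookup above):

# Restart policy + tini entrypoint + healthcheck that probes the published
# host port through the gateway; if docker-proxy is gone, curl fails and the
# healthcheck kills PID 1 (tini), so the container exits and gets restarted.
docker run -d --name my-api \
  --restart on-failure \
  -p 8080:8080 \
  --entrypoint /sbin/tini \
  --health-cmd 'curl -f http://172.17.0.1:8080 || (kill $(pgrep tini) && exit 1)' \
  --health-interval 30s \
  --health-retries 1 \
  my-api-image -- my-api-server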

@vasi26ro

vasi26ro commented Sep 1, 2022

+1 - also having the same issue with Docker version 20.10.14, build a224086349, installed via snap.
After switching to Docker installed with apt, this behavior disappeared.
