
Publishing ports explicitly to private networks should not be accessible from LAN hosts #45610

Open
polarathene opened this issue May 25, 2023 · 6 comments
Labels
area/networking/portmapping area/networking area/security kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. status/0-triage

Comments

@polarathene
Contributor

polarathene commented May 25, 2023

Description

With docker run --rm -d -p 127.0.0.1:8081:80 traefik/whoami, it is not expected that hosts on a common network could access the service, but they can due to iptables rules managed by Docker:

For example, on the docker host (192.168.42.10), the container could be reached at 127.0.0.1:8081 or 172.17.0.3:80 from a separate host at 192.168.42.20.

Reproduce

Run the following commands:

  • Docker host (192.168.42.10):

    Technically only the 2nd container (8081) is relevant for reproduction:

    • Public is expected to be accessible via LAN IP.
    • Internal is not expected to be accessible outside of the docker host.
  • Private is not accessible due to no published port (unless the FORWARD chain policy is set to ACCEPT).
    # Default binding address: `0.0.0.0`:
    docker run --rm -d -p 8080:80 --name public traefik/whoami
    
    # Explicitly only accessible internally via localhost (or container IP: 172.17.0.3:80):
    docker run --rm -d -p 127.0.0.1:8081:80 --name internal traefik/whoami
    
    # Only reachable via container IP (172.17.0.4:80):
    docker run --rm -d --name private traefik/whoami
  • Neighbour host (192.168.42.20, same LAN):

    # NOTE: Firewalld prevents access via docker zone
    # Route to `docker0` bridge at the docker host:
    ip route add 172.17.0.0/16 via 192.168.42.10
    
    # LAN host successfully connects to container at docker host via published container port:
    curl 172.17.0.3:80
    # Route to 127.0.0.1 at the docker host:
    ip addr add 127.0.0.2/8 dev lo
    ip addr del 127.0.0.1/8 dev lo
    ip route add 127.0.0.1 via 192.168.42.10
    # NOTE: Alternatively `all` could instead be the common LAN interface (eg: `eth1`):
    sysctl net.ipv4.conf.all.route_localnet=1
    
    # LAN host successfully connects to container at docker host via published host port:
    curl 127.0.0.1:8081

nmap can be used to identify what ports are reachable at the docker host (only the published ports from Docker).
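To make that check concrete, a scan from the neighbour host could look like the following (illustrative commands, not part of the original report; the IPs and ports are the ones used in this reproduction):

```shell
# From the neighbour host (192.168.42.20), after adding the routes above.
# Only the ports Docker published should show as open:
nmap -p 80 172.17.0.3     # "internal" container via the routed bridge subnet
nmap -p 8081 127.0.0.1    # loopback address routed to the docker host
```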

Expected behavior

I did not expect hosts on the same network to be able to reach private subnets of a separate host like 127.0.0.1 or 172.17.0.0/16.

Ports bound to 127.0.0.1 by services not running in containers were not reachable, since no equivalent iptables rules permit that routing (AFAIK). It is unclear whether Docker could be more restrictive about this routing requirement.

docker version

Click to view
Client: Docker Engine - Community
 Version:           24.0.1
 API version:       1.43
 Go version:        go1.20.4
 Git commit:        6802122
 Built:             Fri May 19 18:07:52 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.1
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.4
  Git commit:       463850e
  Built:            Fri May 19 18:06:17 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.21
  GitCommit:        3dce8eb055cbb6872793272b4f20ed16117344f8
 runc:
  Version:          1.1.7
  GitCommit:        v1.1.7-0-g860f061
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Click to view
Client: Docker Engine - Community
 Version:    24.0.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.10.4
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.18.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 3
  Running: 3
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 24.0.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
 runc version: v1.1.7-0-g860f061
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.2.12-200.fc37.x86_64
 Operating System: Fedora Linux 37 (Server Edition)
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 947.4MiB
 Name: vpc-fedora
 ID: 91d9ebe9-0988-4d55-9030-e7cff48f5dd2
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional Info

Reproduced with:

  • userland-proxy enabled / disabled.
  • Ubuntu 23.04 (UFW) and Fedora 37 (Firewalld, docker zone prevents bridge access)
  • Two VMware guests (Arch Linux, no firewall) on a NAT network.
  • Two VPS instances on Vultr connected with a VPC private network (some vendors apparently prevent this traffic).

The scope is apparently limited to layer 2 network switching (not my area of expertise). Some other cloud providers like AWS aren't affected with their VPC by default, requiring opt-in config.

Cause and mitigation options

Docker enables sysctl net.ipv4.ip_forward=1 and adds iptables chain rules (FORWARD => DOCKER) to support published ports; this behaviour is a side-effect of those changes.

  • Routing to 127.0.0.1 can be mitigated via additional constraint to the PREROUTING NAT rule.
  • Routing to 172.16.0.0/12 (or similar docker networks) can at least be mitigated via Firewalld docker zone.
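For context, the rules involved look roughly like this on a default bridge-network setup (a sketch of what Docker installs; exact rules vary with engine version and daemon options):

```shell
# nat table: any packet whose destination resolves to a local address on the
# host is handed to the DOCKER chain, where DNAT rewrites it to the
# container. Adding a `! -d 127.0.0.1` style constraint here is the
# mitigation mentioned above.
iptables -t nat -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER

# filter table: traffic forwarded towards the bridge is evaluated by the
# DOCKER chain, which accepts published ports regardless of which interface
# the packet entered the host on.
iptables -A FORWARD -o docker0 -j DOCKER
```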

Related past vulnerability with ip_forward=1

A related issue was reported years ago and resolved. The vulnerability is now present at a smaller scope (access only to containers via explicitly published ports, instead of all container ports), but may still pose a risk (indirect access to such containers on the docker host's VPN network?).

This is presumably still a valid concern on untrusted networks (cafe / airport wifi), or trusted networks (home / corporate) if a LAN host were compromised.

You can find comments within existing issues from many years ago detailing how to perform this (a recent example (Nov 2021)), as well as other sources outside of the moby repo that discuss it. Hence this public report, as the behaviour had already been disclosed publicly.


Related past vulnerability from route_localnet=1 elsewhere

This issue initially began as an investigation into userland-proxy: false setting sysctl net.ipv4.conf.docker0.route_localnet=1, which doesn't appear to be a risk (as detailed in point 3 here).

I had seen a similar vulnerability in kubernetes which I wanted to verify (I had less familiarity with route_localnet at the time):

That vulnerability, although similar, differs:

  • It allowed reaching any port on 127.0.0.1 via the common LAN interface route, while the moby one is constrained to published ports only.
  • moby only sets route_localnet=1 for its own bridge networks (not for all interfaces via sysctl net.ipv4.conf.all.route_localnet=1 on the docker host, although narrowing that down to the common LAN interface would be sufficient).
  • Both Firewall frontends protect against the attack (unpublished ports on 127.0.0.1):
    • UFW (sets INPUT default policy DROP)
    • Firewalld (LAN interface in a zone with target: default)
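To verify the scope described above on a given docker host, the relevant sysctls and zone assignment can be inspected (illustrative commands; `docker0` and `eth1` are assumed interface names):

```shell
# 1 when the daemon runs with userland-proxy: false:
sysctl net.ipv4.conf.docker0.route_localnet

# Should remain 0; moby does not set route_localnet globally:
sysctl net.ipv4.conf.all.route_localnet

# Firewalld: confirm the LAN interface sits in a zone with `target: default`:
firewall-cmd --get-zone-of-interface=eth1
```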
@polarathene polarathene added the kind/bug and status/0-triage labels May 25, 2023
@polarathene
Contributor Author

polarathene commented May 25, 2023

Workaround

In the meantime for those concerned, it can be prevented by:

  • For requests to <container IP>:<container port>:
    a. Use Firewalld (due to the docker zone).
    b. UFW lacks the Firewalld zone feature/integration, but manual forwarding rules seem to work:

    # For each docker network interface, deny forwarding traffic from the LAN (eg: `enp6s0`):
    # Probably should be more specific if you intend for published ports to be accessible via LAN
    iptables -I DOCKER-USER -i docker0 -o enp6s0 -j DROP
    iptables -I DOCKER-USER -o docker0 -i enp6s0 -j DROP
  • For requests via curl <LAN IP>:<host port>, neither firewall prevents it out of the box (manual rules required?):

    # Avoid applying DNAT rules too early when destination is `127.0.0.1` (delay until OUTPUT chain):
    # https://askubuntu.com/questions/579231/whats-the-difference-between-prerouting-and-forward-in-iptables/579242#579242
    iptables -t nat -D PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
    iptables -t nat -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER ! -d 127.0.0.1

For some scenarios, you may have alternative options that do not require port publishing. Or the upcoming IPVS NAT alternative may be suitable when paired with a firewall :)


If anyone would like to pursue a PR, the PREROUTING rule could be adjusted here:

    case Nat:
        preroute := []string{
            "-m", "addrtype",
            "--dst-type", "LOCAL",
            "-j", c.Name}

There is probably a bit more to it, and I don't know for sure whether that minor addition to exclude 127.0.0.1 would cause regressions elsewhere 😅

I'm not sure how the other adjustment could be approached 😅 (perhaps documentation would be easier)

@polarathene polarathene changed the title LAN hosts can route to published container ports restricted to a docker hosts private networks Publishing ports explicitly to private networks should not be accessible from LAN hosts May 26, 2023
@neersighted
Member

@polarathene Thanks for the high-quality issue! I think this might be a duplicate (this is definitely well known with the existing network model), but I couldn't find it; in any case you've added enough of a detailed model here that we might want to use this issue as the reference going forward.

Are you on the community Slack by any chance? If you could join there, I'd like to put you more directly in touch with the folks working on improving the networking stack.

@polarathene
Contributor Author

polarathene commented May 26, 2023

I think this might be a duplicate (this is definitely well known with the existing network model), but I couldn't find it

I am aware of two issues that are similar:

While this issue itself is about publishing only on 127.0.0.1, and being able to access those published ports at 127.0.0.1 or at the equivalent container bridge IP in 172.17.0.0/16.

If there was another issue tracking this same problem, nothing stood out to me when I did a quick search for existing issues 😅


Are you on the community Slack by any chance?

I am now :) I sent you a DM there 👍

@polarathene
Contributor Author

polarathene commented Jun 2, 2023

When fixed, the networking docs have a reference to this issue that can be dropped: docker/docs#17176 (comment)

@neersighted
Member

#14041 is what I had in mind -- nice find 😀

gzm0 added a commit to gzm0/scala-js-website that referenced this issue Sep 23, 2023
By default, published docker ports "bind" to all inbound addresses. We
restrict to localhost to avoid exposing the site to the internet.

Note that malicious same L2 participants can still reach the container
due to:
moby/moby#45610
@fherenius

Any updates or progress related to this issue?
