Unable to connect opa-docker-authz.sock #51

Open
ramapalani opened this issue Sep 16, 2020 · 19 comments

@ramapalani

I'm trying to run the OPA Docker plugin in a DIND (docker-in-docker) DaemonSet.
I followed the steps in this tutorial: https://www.openpolicyagent.org/docs/latest/docker-authorization/#goals

The only rule in the Rego file prevents privileged containers. This works as expected in a pre-prod environment. In the prod environment it works as expected for about an hour, after which the OPA plugin is no longer reachable. The Docker logs contain messages like these:

time="2020-09-06T19:08:06.723350267Z" level=warning msg="Unable to connect to plugin: /run/docker/plugins/e680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8/opa-docker-authz.sock/AuthZPlugin.AuthZReq: Post http://%2Frun%2Fdocker%2Fplugins%2Fe680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8%2Fopa-docker-authz.sock/AuthZPlugin.AuthZReq: dial unix /run/docker/plugins/e680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8/opa-docker-authz.sock: connect: connection refused, retrying in 1s"

time="2020-09-06T19:08:21.759791345Z" level=error msg="Handler for POST /v1.39/images/create returned error: plugin openpolicyagent/opa-docker-authz-v2:0.7 failed with error: Post http://%2Frun%2Fdocker%2Fplugins%2Fe680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8%2Fopa-docker-authz.sock/AuthZPlugin.AuthZReq: dial unix /run/docker/plugins/e680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8/opa-docker-authz.sock: connect: connection refused"

Daemonset definition:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dind-daemonset
spec:
...
  template:
    spec:
      containers:
      - name: dind
        image: docker:18.09.5-dind
        command: ['sh', '-c', 'if [ -d /var/run/dind/docker.sock ]; then rm -rf /var/run/dind/docker.sock;fi && /usr/local/bin/dockerd-entrypoint.sh dockerd --storage-driver=overlay2 -H unix:///var/run/dind/docker.sock']
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", "mkdir -p /etc/docker/policies && cp /etc/docker/opa-policy/authz.rego /etc/docker/policies && docker -H unix:///var/run/dind/docker.sock plugin install --grant-all-permissions openpolicyagent/opa-docker-authz-v2:0.7 opa-args=\"-policy-file /opa/policies/authz.rego\" && echo '{ \"authorization-plugins\": [\"openpolicyagent/opa-docker-authz-v2:0.7\"] }' > /etc/docker/daemon.json && kill -HUP $(pidof dockerd)"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: varlibdocker
          mountPath: /var/lib/docker
        - name: rundind
          mountPath: /var/run/dind
        - name: opa-policy
          mountPath: /etc/docker/opa-policy
...
      volumes:
      - name: varlibdocker
        emptyDir: {}
      - name: opa-policy
        configMap: 
          name: docker-opa-policy
      - name: rundind
        hostPath:
          path: /var/run/dind/

authz.rego:

apiVersion: v1
kind: ConfigMap
metadata:
  name: docker-opa-policy
data:
  authz.rego: |-
    package docker.authz

    default allow = false

    allow {
        not input.Body.HostConfig.Privileged
    }

@ashutosh-narkar
Member

Are there any other logs? Any more information from running docker plugin inspect?

@ashutosh-narkar
Member

Also, what's different between the pre-prod and prod environments?

@ramapalani
Author

ramapalani commented Sep 17, 2020 via email

@ashutosh-narkar
Member

In that case, have you tried allotting more resources to check whether the system is being exhausted?

@ramapalani
Author

ramapalani commented Sep 17, 2020 via email

@ramapalani
Author

Actual consumption screenshot: [image]

@ashutosh-narkar
Member

Memory usage typically depends on the size of the data and policy that you load into OPA. This page provides more details on resource utilization. Do you have an estimate of these values?

@ramapalani
Author

This is the policy; it evaluates only one field.

    package docker.authz

    default allow = false

    allow {
        not input.Body.HostConfig.Privileged
    }

I don't control the data; Docker sends the input data to the OPA plugin.

Here is a sample input with Body as null:

time="2020-09-05T19:40:15Z" level=error msg="2020/09/05 19:40:15 {\"config_hash\":\"f418bd1c862c2178ff5c93054aa8c8adae2ddae3aa90a68e4011c07d396839d4\",\"decision_id\":\"78c32ebd-a216-4ea1-a971-acbc879df361\",\"input\":{\"AuthMethod\":\"\",\"Body\":null,\"Headers\":{\"Accept-Encoding\":\"gzip\",\"Connection\":\"close\",\"User-Agent\":\"go-dockerclient\"},\"Method\":\"GET\",\"Path\":\"/images/sha256:xxxxxxcc040e350e848dd39bf1cabc09653adb7ede6f050cbd16a7503de6/json\",\"User\":\"\"},\"labels\":{\"app\":\"opa-docker-authz\",\"id\":\"b6b53359-69d3-45e8-acbf-b7258ea848cf\",\"opa_version\":\"v0.18.0\",\"plugin_version\":\"0.7\"},\"result\":true,\"timestamp\":\"2020-09-05T19:40:15.136801273Z\"}" plugin=e680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8

When Body is not null, the data is around 6 KB.

In 60 minutes the OPA Docker plugin processed around 2000 requests.

Is there a way for me to control the size of the data?
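
For scale, here is a rough back-of-the-envelope check of the figures above. It assumes every request carries the full ~6 KB body, which is an overestimate because many requests have a null Body. The helper names are hypothetical and used only for illustration.

```sh
#!/bin/sh
# Figures quoted in the comment above. Treating every request as ~6 KB is a
# deliberate overestimate (assumption), since many requests have a null Body.
requests_per_hour=2000
kb_per_request=6

# Total input volume handed to the plugin per hour, in KB.
total_input_kb() {
    echo $(( requests_per_hour * kb_per_request ))
}

# Average request rate per minute (integer division, rounded down).
requests_per_minute() {
    echo $(( requests_per_hour / 60 ))
}

echo "input per hour: $(total_input_kb) KB"
echo "request rate:   about $(requests_per_minute) per minute"
```

Under these assumptions the plugin sees on the order of 12 MB of input per hour, so the input volume alone looks modest.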

@ramapalani
Author

@ashutosh-narkar Can you suggest a way to reduce the data or another way to avoid this 'huge' memory consumption by OPA docker plugin?

@ashutosh-narkar
Member

The data seems pretty small. Have you recorded OPA's memory usage over time? And how much memory have you allocated so far?
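
One way to record memory usage over time is to sample the plugin process's resident set size from /proc. The following is a minimal sketch, not a tested monitoring setup. It assumes a Linux /proc filesystem, that `pgrep` is available, and that the plugin process can be matched by the name `opa-docker-authz`; the helper names and the log path are hypothetical.

```sh
#!/bin/sh
# Print the resident set size (VmRSS, in kB) recorded in a /proc-style
# status file. Prints nothing if the file has no VmRSS line.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "$1"
}

# Print one "timestamp pid rss_kb" line per matching process.
# The default process name is an assumption about the plugin binary.
sample_rss() {
    proc_name=${1:-opa-docker-authz}
    for pid in $(pgrep -f "$proc_name"); do
        # The process may exit between pgrep and the read; skip it then.
        [ -r "/proc/$pid/status" ] || continue
        echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $pid $(rss_kb "/proc/$pid/status")"
    done
}

# Usage (not run here): take one sample per minute into a log file.
# while true; do sample_rss >> /tmp/opa-rss.log; sleep 60; done
```

Plotting the third column against the first would show whether memory grows steadily with traffic or jumps at a particular point.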

@ramapalani
Author

The resource request is 4 GB, but actual usage went up to 25 GB, and then the connection to the socket was lost. We had to start Docker DIND without the OPA plugin to get it working again.

@ashutosh-narkar
Member

@ramapalani Can you provide an example of how to reproduce the issue? Any scripts that you have to simulate the traffic etc. would be helpful.

@ramapalani
Author

I'll try to reproduce this in our pre-prod environment and share it with you.

@ramapalani
Author

@ashutosh-narkar I'm trying to reproduce this in the pre-prod environment. As part of this effort, I check whether the socket is open every second using a simple shell script. On each failure I also collect the open files and running processes.

Although I can't reproduce the issue exactly as in the prod environment, I often see that the OPA socket is not listening. Here is one instance of the failure. Usually the next check succeeds, but failures do happen frequently.

Test script

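#!/bin/sh
# Probe the OPA plugin socket once per second; log diagnostics on failure.

# Install socat if it is missing (Alpine-based image).
if ! command -v socat >/dev/null 2>&1; then apk add socat; fi

# POSIX function syntax: the "function" keyword is not portable under /bin/sh.
testsocket() {
    # The plugin directory name is a generated ID, so locate the socket by name.
    socket=$(find /run/docker/plugins/ -name "*.sock" | grep opa)
    socat -u OPEN:/dev/null "UNIX-CONNECT:${socket}"
    EXIT_CODE=$?
    if [ "${EXIT_CODE}" -eq 0 ]; then
        echo "$(date): Connection to Socket successful"
    else
        echo "$(date): Connection to Socket FAILED"
        echo "Open files"
        lsof | grep opa
        echo "Running processes"
        ps -ef
    fi
}

output_file=/tmp/testsocket.log

# Record the plugin state once at startup.
set -x
docker -H unix:///var/run/dind/docker.sock plugin ls | tee "${output_file}"
docker -H unix:///var/run/dind/docker.sock plugin inspect openpolicyagent/opa-docker-authz-v2:0.7 | tee -a "${output_file}"
set +x

while true
do
    testsocket | tee -a "${output_file}"
    sleep 1
done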

Failure

Tue Sep 22 21:28:33 UTC 2020: Connection to Socket successful
Tue Sep 22 21:28:34 UTC 2020: Connection to Socket successful
2020/09/22 21:28:35 socat[28132] E exiting on signal 11
Tue Sep 22 21:28:35 UTC 2020: Connection to Socket FAILED
Open files
219	/opa-docker-authz	/dev/null
219	/opa-docker-authz	pipe:[208415330]
219	/opa-docker-authz	pipe:[208415331]
219	/opa-docker-authz	anon_inode:[eventpoll]
219	/opa-docker-authz	pipe:[208410360]
219	/opa-docker-authz	pipe:[208410360]
219	/opa-docker-authz	socket:[208410361]
219	/opa-docker-authz	socket:[208482551]
Running processes
PID   USER     TIME  COMMAND
    1 root     13:45 dockerd --storage-driver=overlay2 -H unix:///var/run/dind/docker.sock
   24 root      0:10 containerd --config /var/run/docker/containerd/containerd.toml --log-level info
  201 root      0:00 containerd-shim -namespace plugins.moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/plugins.moby/2946790b93416011fcf7eed801b307afbea481a8d3992b6a538e91ede4bf96e8 -address /var/run/docker/containerd/containerd.sock -containerd-binary /usr/local/bin/containerd -runtime-root /run/docker/plugins/runtime-root
  219 root      0:11 /opa-docker-authz -policy-file /opa/policies/authz.rego
 5276 root      0:00 sh
 9215 root      0:00 sh
19177 root      0:00 sh
21420 root      0:00 {test-socket.sh} /bin/sh ./test-socket.sh
22230 root      0:00 tail -f /tmp/testsocket.log
28127 root      0:00 {test-socket.sh} /bin/sh ./test-socket.sh
28128 root      0:00 tee -a /tmp/testsocket.log
28136 root      0:00 ps -ef
Tue Sep 22 21:28:36 UTC 2020: Connection to Socket successful
Tue Sep 22 21:28:37 UTC 2020: Connection to Socket successful

Full log file is attached: testsocket.log
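
The `socat[28132] E exiting on signal 11` line in the log above is printed by socat, so in that instance the probe itself was hit by a signal. A minimal sketch of how the test script could report why a probe failed, assuming the probe's exit status is captured from `$?`: by POSIX shell convention, a status above 128 means the command was terminated by a signal. `describe_status` is a hypothetical helper, not part of the original script, and socat may handle some signals itself and exit with an ordinary status, so this is only a starting point.

```sh
#!/bin/sh
# Translate a shell exit status into a short description.
# By POSIX shell convention, a status above 128 means the command was
# terminated by signal number (status - 128).
describe_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $(( status - 128 ))"
    else
        echo "exited with error $status"
    fi
}

# Usage (not run here), right after the socat call in testsocket:
#   socat -u OPEN:/dev/null "UNIX-CONNECT:${socket}"
#   echo "$(date): probe result: $(describe_status $?)"
```

Logging this next to each FAILED line would help separate a refused connection from a crash of the probe tool.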

@ashutosh-narkar
Member

Hmm you're getting a segmentation fault. What system are you running this on ?

@ramapalani
Author

We run a Docker DIND (docker-in-docker) container as a Kubernetes DaemonSet, using the image docker:18.09.5-dind. The OPA Docker plugin is installed into this instance of Docker.

@ramapalani
Author

@ashutosh-narkar I couldn't reproduce this issue in the pre-prod environment, but we encounter it consistently in the production environment (with higher traffic) after a short period.

So I created a custom plugin to prevent privileged container creation, and that works well.

@ashutosh-narkar
Member

That's great ! Is that custom plugin using OPA ?

@ramapalani
Author

No, I created a fresh Docker authorization plugin, totally separate from OPA.
