ksync init cannot connect to docker daemon #517

Open
tetchel opened this issue Mar 29, 2021 · 17 comments

Comments

@tetchel

tetchel commented Mar 29, 2021

Hi, I am trying to run ksync init to set up ksync on a CodeReady Containers cluster running locally on my Fedora machine.

ksync init failed with this message:

[ /redhat-actions/openshift-actions-connector/containerize ] 47 (main) $ ksync init
==== Preflight checks ====

==== Cluster Environment ====

==== Postflight checks ====
↳	rpc error: code = Unknown desc = Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/version": dial unix /var/run/docker.sock: connect: permission denied
FATA[0001] 

Adding sudo results in the same error message.

dockerd is running and I can run all docker commands as my current user, since I am part of the docker group.

Thanks for any help

[ /redhat-actions/openshift-actions-connector/containerize ] 06 (main) $ ksync version
ksync:
	Version:    Release
	Go Version: go1.16.2
	Git Commit: 14ec9e2
	Git Tag:    0.4.7-hotfix
	Built:      Wed Mar 24 22:04:50 +0000 2021
	OS/Arch:    linux/amd64
service:
	Version:    Release
	Go Version: go1.16.2
	Git Commit: 14ec9e2
	Git Tag:    0.4.7-hotfix
	Built:      Wed Mar 24 22:08:03 +0000 2021
[ /redhat-actions/openshift-actions-connector/containerize ] 06 (main) $ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)

Server:
 Containers: 14
  Running: 0
  Paused: 0
  Stopped: 14
 Images: 616
 Server Version: 20.10.5
 Storage Driver: btrfs
  Build Version: Btrfs v5.10 
  Library Version: 102
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.23-200.fc33.x86_64
 Operating System: Fedora 33 (Workstation Edition)
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 31.11GiB
 Name: tims-fedora
 ID: 2TP6:X5QS:VJ36:ZABX:GMIV:J6R3:Q5DW:44JZ:EGSR:MR3G:R3IZ:G42Z
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: Support for cgroup v2 is experimental
@tetchel
Author

tetchel commented Mar 29, 2021

I see #289, but it seems to have been closed without resolution.

@timfallmk
Collaborator

@tetchel Looks like you're using Btrfs as a storage driver. Unfortunately, ksync doesn't support that right now.

@tetchel
Author

tetchel commented Apr 7, 2021

Thanks for the response. I switched to the overlay2 storage driver and am still getting the same result.

[ /src/redhat-actions/openshift-actions-connector ] 08 (main) $ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 59
 Server Version: 20.10.5
 Storage Driver: overlay2
  Backing Filesystem: btrfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.23-200.fc33.x86_64
 Operating System: Fedora 33 (Workstation Edition)
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 31.11GiB
 Name: tims-fedora
 ID: 2TP6:X5QS:VJ36:ZABX:GMIV:J6R3:Q5DW:44JZ:EGSR:MR3G:R3IZ:G42Z
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: tetchel
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: Support for cgroup v2 is experimental

[ /src/redhat-actions/openshift-actions-connector ] 12 (main) $ sudo ksync init
[sudo] password for tim: 
==== Preflight checks ====

==== Cluster Environment ====

==== Postflight checks ====
↳	rpc error: code = Unknown desc = Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/version": dial unix /var/run/docker.sock: connect: permission denied
FATA[0001]        

@timfallmk
Collaborator

That would indicate the local docker daemon isn't available. Is there some special configuration for redhat when running the daemon? Maybe check your local docker client setup to see how it connects.
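One way to check how a client reaches the daemon, independent of the docker CLI, is to request the same endpoint that appears in the error (`/v1.24/version`) directly over the Unix socket. This is a minimal sketch, assuming curl is installed; the function name `docker_version_via_socket` is hypothetical and not part of ksync or Docker.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ask the Docker daemon for its version over a Unix socket.
# Usage: docker_version_via_socket [socket-path]
# The path defaults to /var/run/docker.sock, the path shown in the error above.
docker_version_via_socket() {
  local sock="${1:-/var/run/docker.sock}"

  if ! command -v curl >/dev/null 2>&1; then
    echo "curl not found" >&2
    return 127
  fi

  # --fail makes curl return non-zero on HTTP errors; a missing or
  # unreadable socket also produces a non-zero exit status.
  curl --silent --show-error --fail \
    --unix-socket "$sock" "http://localhost/v1.24/version"
}
```

Running `docker_version_via_socket` as the same user that runs ksync should print a JSON document if that user can reach the socket. This only exercises the local machine; if the error is produced by the ksync pod in the cluster, the local result may differ.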

@tetchel
Author

tetchel commented Apr 8, 2021

I am just on a regular Fedora distribution. Docker is definitely running as I can build images, run docker ps, etc.

/run/docker.sock is there as expected, I'm not sure what else I would check.

@timfallmk
Collaborator

Well that would do it 😄 . The trace above shows it’s looking for the socket at /var/run/docker.sock, which is the default for most platforms. I vaguely remember RedHat doing something different.

There’s a ksync config option to give a path to the Docker socket for this reason. I can’t look it up at the moment, as I’m not at my machine, but it should™️ be in the documentation.

@tetchel
Author

tetchel commented Apr 8, 2021

Sorry, my mistake: it's in both places.

[ /src/redhat-actions/openshift-actions-connector ] 21 (main) $ ls -l /var/run/docker.sock
srw-rw----. 1 root docker 0 Apr  2 16:49 /var/run/docker.sock
[ /src/redhat-actions/openshift-actions-connector ] 21 (main) $ sudo ls -l /run/docker.sock
[sudo] password for tim: 
srw-rw----. 1 root docker 0 Apr  2 16:49 /run/docker.sock
[ /src/redhat-actions/openshift-actions-connector ] 22 (main) $
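The same check can be scripted so it reports whether the invoking user has read and write access to a given socket path. This is a rough sketch; the function name `check_docker_sock` and its messages are hypothetical. It only looks at what the shell's file tests report, so it may not reflect every check SELinux applies when a process actually connects.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the current user can use a socket path.
# Usage: check_docker_sock [socket-path]   (defaults to /var/run/docker.sock)
# Returns 0 if readable and writable, 1 if access is denied, 2 if not a socket.
check_docker_sock() {
  local sock="${1:-/var/run/docker.sock}"

  if [ ! -S "$sock" ]; then
    echo "not a socket: $sock"
    return 2
  fi

  if [ -r "$sock" ] && [ -w "$sock" ]; then
    echo "read/write OK: $sock"
    return 0
  fi

  echo "permission denied: $sock"
  return 1
}
```

For example, `check_docker_sock /run/docker.sock` checks the second path listed above. Note that root passes the read and write tests regardless of the mode bits, so a failure under sudo points at something other than classic file permissions.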

@timfallmk
Collaborator

Hmm. There’s still an itch in the back of my brain about RH/Fedora releases doing something different here. Btw, I’m not sure it’s actually documented (oops), but here’s the relevant flag

flags.String(

SELinux issue @grampelberg ?

@timfallmk
Collaborator

timfallmk commented Apr 8, 2021

Two things to try:

  1. If you haven’t, run the doctor command.
  2. Try both that and init as root and see if there’s a difference

@tetchel
Author

tetchel commented Apr 15, 2021

It has the same problem

[ /tim/downloads/crc-linux-1.25.0-amd64 ] 53 $ ksync doctor
↳	rpc error: code = Unknown desc = Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/version": dial unix /var/run/docker.sock: connect: permission denied
[ /tim/downloads/crc-linux-1.25.0-amd64 ] 53 $ sudo ksync doctor
↳	rpc error: code = Unknown desc = Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/version": dial unix /var/run/docker.sock: connect: permission denied

I tried some other things, like sudo -i and chmod 666 /var/run/docker.sock, and still somehow get 'permission denied'.

w/ debug output:

[ /tim/downloads/crc-linux-1.25.0-amd64 ] 00 $ sudo ksync doctor --log-level=debug
Extra Binaries                              ✓
DEBU[0000] initializing kubernetes client                context=
DEBU[0000] kubernetes client created                     host="https://api.crc.testing:6443"
Cluster Config                              ✓
Cluster Connection                          ✓
Cluster Version                             ✓
Cluster Permissions                         ✓
Cluster Service                             ✓
DEBU[0000] radar nodes                                   count=1
DEBU[0000] checking to see if radar is ready             nodeName=crc-xl2km-master-0
DEBU[0000] found pod name                                Namespace=kube-system RadarPort=40321 SyncthingAPI=8384 SyncthingListener=22000 nodeName=crc-xl2km-master-0 podName=ksync-nfpmr
DEBU[0000] found pod                                     nodeName=crc-xl2km-master-0 podName=ksync-nfpmr status=Running
Service Health                              ✓
DEBU[0000] radar nodes                                   count=1
DEBU[0000] checking to see if radar is ready             nodeName=crc-xl2km-master-0
DEBU[0000] found pod name                                Namespace=kube-system RadarPort=40321 SyncthingAPI=8384 SyncthingListener=22000 nodeName=crc-xl2km-master-0 podName=ksync-nfpmr
DEBU[0000] found pod                                     nodeName=crc-xl2km-master-0 podName=ksync-nfpmr status=Running
DEBU[0000] checking to see if radar is ready             nodeName=crc-xl2km-master-0
DEBU[0000] found pod name                                Namespace=kube-system RadarPort=40321 SyncthingAPI=8384 SyncthingListener=22000 nodeName=crc-xl2km-master-0 podName=ksync-nfpmr
DEBU[0000] found pod                                     nodeName=crc-xl2km-master-0 podName=ksync-nfpmr status=Running
DEBU[0000] starting tunnel                               LocalPort=46315 Namespace=kube-system Out= PodName=ksync-nfpmr RemotePort=40321 url="https://api.crc.testing:6443/api/v1/namespaces/kube-system/pods/ksync-nfpmr/portforward"
DEBU[0000] tunnel running                                LocalPort=46315 Namespace=kube-system Out="Forwarding from 127.0.0.1:46315 -> 40321\nForwarding from [::1]:46315 -> 40321\n" PodName=ksync-nfpmr RemotePort=40321
Service Version                             ✓
DEBU[0000] radar nodes                                   count=1
DEBU[0000] checking to see if radar is ready             nodeName=crc-xl2km-master-0
DEBU[0000] found pod name                                Namespace=kube-system RadarPort=40321 SyncthingAPI=8384 SyncthingListener=22000 nodeName=crc-xl2km-master-0 podName=ksync-nfpmr
DEBU[0000] found pod                                     nodeName=crc-xl2km-master-0 podName=ksync-nfpmr status=Running
DEBU[0000] starting tunnel                               LocalPort=45559 Namespace=kube-system Out= PodName=ksync-nfpmr RemotePort=40321 url="https://api.crc.testing:6443/api/v1/namespaces/kube-system/pods/ksync-nfpmr/portforward"
DEBU[0000] tunnel running                                LocalPort=45559 Namespace=kube-system Out="Forwarding from 127.0.0.1:45559 -> 40321\nForwarding from [::1]:45559 -> 40321\n" PodName=ksync-nfpmr RemotePort=40321
Docker Version                              ✘
↳	rpc error: code = Unknown desc = Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/version": dial unix /var/run/docker.sock: connect: permission denied

@timfallmk
Collaborator

I'm just shooting in the dark now, but does your docker startup config (wherever that is on Fedora) have anything in it that would be a non-standard setting? It sounds like there's something Fedora does that's non-standard but I don't know what.

@timfallmk
Collaborator

timfallmk commented Apr 15, 2021

@zawias-pro

zawias-pro commented May 26, 2021

I have the same issue on Linux Mint (an Ubuntu derivative).

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
  scan: Docker Scan (Docker Inc., v0.7.0)

Server:
 Containers: 3
  Running: 3
  Paused: 0
  Stopped: 0
 Images: 99
 Server Version: 20.10.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-73-generic
 Operating System: Linux Mint 20.1
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.13GiB
 Name: ***
 ID: ***
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: ***
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

@mrtimp

mrtimp commented Jul 16, 2021

I'm not sure if this helps or not but I ran into a similar issue - not 100% the same as the one described here (maybe closer to #289). I'm currently using microk8s on macOS Big Sur (microk8s is installed/configured/managed via multipass and is a VirtualBox VM). After a bit of debugging it turned out that the rpc error: code = Unknown desc = Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? error that I was getting was because the ksync pod couldn't access the Docker socket - which makes total sense because in my situation Docker is running on macOS not inside the VirtualBox VM.

What I did to work around this until I can come up with a better solution was run two socat instances, the first on macOS exposing /var/run/docker.sock as a TCP service:

socat -d TCP-LISTEN:17443,reuseaddr,fork UNIX-CLIENT:/var/run/docker.sock

then another inside the VM connecting to the TCP service on macOS and recreating the socket in /var/run/docker.sock:

sudo socat -d UNIX-LISTEN:/var/run/docker.sock,reuseaddr,fork TCP:[macOS host IP]:17443

This is a total hack for now but I just needed to determine where the issue was.

I first attempted to share /var/run/docker.sock into the VM, but because it is a Unix socket it loses access to the kernel inside the VM and becomes a file. socat to the rescue ;-)

@timfallmk
Collaborator

timfallmk commented Jul 30, 2021 via email

@kaiivoschneider

Alright, it's been a year. Where is the support?

@timfallmk
Collaborator

@kaiivoschneider Was there something in particular you were looking for? In general, this project is largely defunct, and both @grampelberg and I have moved on to other things. I do my best to answer what I can, but there won't be any significant work on the project.
