This repository has been archived by the owner on Jul 7, 2020. It is now read-only.

Unable to deploy on Ubuntu 18.04 -> pods not found. #635

Open
ghost opened this issue Jan 23, 2020 · 6 comments

Comments

@ghost

ghost commented Jan 23, 2020

Hello,

I'm quite new to K8s and Rancher, but I need some kind of globally available storage, so I tried out this project. However, every time I try to deploy GlusterFS
I get the following error in the dashboard:

Readiness probe failed: /usr/local/bin/status-probe.sh failed check: systemctl -q is-active glusterd.service

and the deployment never goes through successfully.
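
For reference, the same check the probe runs can be repeated by hand once the pods are scheduled (the pod name below is a placeholder):

kubectl -n default get pods --selector=glusterfs=pod                                          # list the GlusterFS pods
kubectl -n default exec <glusterfs-pod> -- systemctl -q is-active glusterd.service; echo $?   # 0 means glusterd is active inside the pod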

The log output of the pod looks like this:

23.1.2020 19:15:20 maximum number of pids configured in cgroups: max
23.1.2020 19:15:20 maximum number of pids configured in cgroups (reconfigured): max
23.1.2020 19:15:20 env variable is set. Update in gluster-blockd.service

and that's basically it ...
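
The pod events usually give a bit more context than the container log; a quick way to look (pod name again a placeholder):

kubectl -n default describe pod <glusterfs-pod>    # the Events section shows the readiness probe failures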

This is what my topology looks like:

{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [
                "worker1"
              ],
              "storage": [
                "192.168.40.151"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/vdb"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "worker2"
              ],
              "storage": [
                "192.168.40.152"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/vdb"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "worker3"
              ],
              "storage": [
                "192.168.40.153"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/vdb"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "worker4"
              ],
              "storage": [
                "192.168.40.154"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/vdb"
          ]
        }
      ]
    }
  ]
}
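
A couple of sanity checks on the file and the devices (as far as I understand, heketi expects bare, unformatted block devices; jq is assumed to be installed):

jq . topology.json      # confirms the topology file is valid JSON
lsblk -f /dev/vdb       # should show no partitions or filesystem on the device
sudo wipefs /dev/vdb    # with no options this only lists existing signatures; it should print nothing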

This is the command I execute on the Rancher node where kubectl is set up and working:

./gk-deploy -g --user-key MyUserKey --admin-key MyAdminKey --ssh-keyfile /root/.ssh/id_rsa -l /tmp/heketi_deployment.log -v topology.json

This is the complete script output:

Do you wish to proceed with deployment?

[Y]es, [N]o? [Default: Y]: Y
Using Kubernetes CLI.

Checking status of namespace matching 'default':
default Active 142m
Using namespace "default".
Checking glusterd status on 'worker1'.
Checking for pre-existing resources...
GlusterFS pods ...
Checking status of pods matching '--selector=glusterfs=pod':

Timed out waiting for pods matching '--selector=glusterfs=pod'.
not found.
deploy-heketi pod ...
Checking status of pods matching '--selector=deploy-heketi=pod':

Timed out waiting for pods matching '--selector=deploy-heketi=pod'.
not found.
heketi pod ...
Checking status of pods matching '--selector=heketi=pod':

Timed out waiting for pods matching '--selector=heketi=pod'.
not found.
gluster-s3 pod ...
Checking status of pods matching '--selector=glusterfs=s3-pod':

Timed out waiting for pods matching '--selector=glusterfs=s3-pod'.
not found.
Creating initial resources ... /bin/kubectl -n default create -f /root/gluster-kubernetes/deploy/kube-templates/heketi-service-account.yaml 2>&1
serviceaccount/heketi-service-account created
/bin/kubectl -n default create clusterrolebinding heketi-sa-view --clusterrole=edit --serviceaccount=default:heketi-service-account 2>&1
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view created
/bin/kubectl -n default label --overwrite clusterrolebinding heketi-sa-view glusterfs=heketi-sa-view heketi=sa-view
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view labeled
OK
Marking 'worker1' as a GlusterFS node.
/bin/kubectl -n default label nodes worker1 storagenode=glusterfs --overwrite 2>&1
node/worker1 not labeled
Marking 'worker2' as a GlusterFS node.
/bin/kubectl -n default label nodes worker2 storagenode=glusterfs --overwrite 2>&1
node/worker2 not labeled
Marking 'worker3' as a GlusterFS node.
/bin/kubectl -n default label nodes worker3 storagenode=glusterfs --overwrite 2>&1
node/worker3 not labeled
Marking 'worker4' as a GlusterFS node.
/bin/kubectl -n default label nodes worker4 storagenode=glusterfs --overwrite 2>&1
node/worker4 not labeled
Deploying GlusterFS pods.
sed -e 's/storagenode: glusterfs/storagenode: 'glusterfs'/g' /root/gluster-kubernetes/deploy/kube-templates/glusterfs-daemonset.yaml | /bin/kubectl -n default create -f - 2>&1
daemonset.extensions/glusterfs created
Waiting for GlusterFS pods to start ...
Checking status of pods matching '--selector=glusterfs=pod':
glusterfs-cnjm5 0/1 Running 0 5m11s
glusterfs-nfs6z 0/1 Running 0 5m11s
glusterfs-rvtrf 0/1 Running 0 5m11s
glusterfs-sw2bd 0/1 Running 0 5m11s
Timed out waiting for pods matching '--selector=glusterfs=pod'.
pods not found.
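
If your copy of gk-deploy has the --abort option (recent versions do), the half-created resources can be torn down before retrying:

./gk-deploy -g --abort topology.json    # removes the partially deployed GlusterFS/heketi resources
# fix the underlying node issue, then re-run the original deploy command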

ghost changed the title from "Unable to deploy on Ubuntu 18.04 with Rancher" to "Unable to deploy on Ubuntu 18.04 -> pods not found." on Jan 23, 2020
@ghost
Author

ghost commented Jan 23, 2020

If I remove the glusterfs-server package, I get: Can't access glusterd on 'worker1'

@ghost
Author

ghost commented Jan 24, 2020

If I take a look at glusterd.log, I always see the message that the port is already in use. That makes sense, since the glusterd service is also running on the worker node itself, outside of K8s. The problem is that I'm not able to use the deployment script if the glusterd service is down; that results in the following error:

Checking status of namespace matching 'default':
default Active 6m37s
Using namespace "default".
Checking glusterd status on 'worker1'.
Can't access glusterd on 'worker1'

If glusterd is running, I get this in glusterd.log:

[2020-01-24 00:41:20.188487] E [socket.c:976:__socket_server_bind] 0-socket.management: binding to failed: Address already in use
[2020-01-24 00:41:20.188499] E [socket.c:978:__socket_server_bind] 0-socket.management: Port is already in use
[2020-01-24 00:41:21.188619] E [socket.c:976:__socket_server_bind] 0-socket.management: binding to failed: Address already in use
[2020-01-24 00:41:21.188660] E [socket.c:978:__socket_server_bind] 0-socket.management: Port is already in use
[2020-01-24 00:41:22.188774] E [socket.c:976:__socket_server_bind] 0-socket.management: binding to failed: Address already in use
[2020-01-24 00:41:22.188804] E [socket.c:978:__socket_server_bind] 0-socket.management: Port is already in use
[2020-01-24 00:41:23.188923] E [socket.c:976:__socket_server_bind] 0-socket.management: binding to failed: Address already in use
[2020-01-24 00:41:23.188953] E [socket.c:978:__socket_server_bind] 0-socket.management: Port is already in use
[2020-01-24 00:41:24.189076] E [socket.c:976:__socket_server_bind] 0-socket.management: binding to failed: Address already in use
[2020-01-24 00:41:24.189115] E [socket.c:978:__socket_server_bind] 0-socket.management: Port is already in use
[2020-01-24 00:41:25.189235] E [socket.c:976:__socket_server_bind] 0-socket.management: binding to failed: Address already in use
[2020-01-24 00:41:25.189268] E [socket.c:978:__socket_server_bind] 0-socket.management: Port is already in use
[2020-01-24 00:41:26.189376] E [socket.c:976:__socket_server_bind] 0-socket.management: binding to failed: Address already in use
[2020-01-24 00:41:26.189413] E [socket.c:978:__socket_server_bind] 0-socket.management: Port is already in use
[2020-01-24 00:41:27.189523] E [socket.c:976:__socket_server_bind] 0-socket.management: binding to failed: Address already in use
[2020-01-24 00:41:27.189585] E [socket.c:978:__socket_server_bind] 0-socket.management: Port is already in use
[2020-01-24 00:41:28.189720] E [socket.c:976:__socket_server_bind] 0-socket.management: binding to failed: Address already in use
[2020-01-24 00:41:28.189751] E [socket.c:978:__socket_server_bind] 0-socket.management: Port is already in use
[2020-01-24 00:41:29.189856] E [socket.c:976:__socket_server_bind] 0-socket.management: binding to failed: Address already in use
[2020-01-24 00:41:29.189889] E [socket.c:978:__socket_server_bind] 0-socket.management: Port is already in use
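
A rough way to confirm the conflict from the worker itself: the GlusterFS daemonset pods use host networking, so a glusterd running directly on the host competes with the containerized one for the management port, 24007 (ss and systemd assumed to be available):

sudo ss -ltnp | grep 24007    # shows which process currently holds the glusterd management port
systemctl status glusterd     # confirms whether that process is the host's own glusterd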

Any idea?

@minhnnhat

Hi @venomone, did you solve this? I met the same problem when creating the GlusterFS pod. Thanks.

@ghost
Author

ghost commented Feb 10, 2020

The simple answer is no... I tried many scenarios, but none of them worked as expected on Ubuntu.
Kindly asking someone to check and fix this.

@minhnnhat

> The simple answer is no... I tried many scenarios, but none of them worked as expected on Ubuntu.
> Kindly asking someone to check and fix this.

Thank you, I moved to Rook. BTW, I found a new project that is an alternative to Heketi; you can try it: http://github.com/kadalu/kadalu

@ghettosamson

I'm running into this same issue on Ubuntu 18.04. Has anyone looked into this? I attempted to use Kadalu, but it doesn't serve my needs: I need the ability to define StatefulSets and have the storage service create a new PVC for each pod without having to predefine or precreate the PVCs.
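
For reference, the per-pod PVC behaviour described above is what volumeClaimTemplates on a StatefulSet provide, given a StorageClass that does dynamic provisioning; a minimal sketch (all names, the image, and the StorageClass are placeholders):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: example-web              # hypothetical headless Service for the StatefulSet
spec:
  clusterIP: None
  selector:
    app: example-web
  ports:
  - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-web
spec:
  serviceName: example-web
  replicas: 3
  selector:
    matchLabels:
      app: example-web
  template:
    metadata:
      labels:
        app: example-web
    spec:
      containers:
      - name: web
        image: nginx:1.17
        volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:          # one PVC per pod is created automatically from this template
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: glusterfs-storage   # placeholder: whatever StorageClass the storage layer provides
      resources:
        requests:
          storage: 1Gi
EOF

Each replica then gets its own PVC (data-example-web-0, data-example-web-1, and so on) with nothing precreated.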
