
atomix-consensus-node permission denied at pod start writing to data with persistent volume block storage on Google Cloud #1114

Open
jallenkj opened this issue Mar 29, 2023 · 0 comments


Deploying Aether Central in Google Cloud, with Atomix configured to use a Google persistent disk StorageClass:

```yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2023-03-28T12:28:32Z"
  name: silver
  resourceVersion: "10306956"
  uid: 7087fc2f-b765-4f02-bd73-7966ba4cfce7
parameters:
  fstype: ext4
  replication-type: none
  type: pd-standard
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
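For context on why this fails: a GCE persistent disk formatted as ext4 is mounted owned by `root:root` by default, so a container running as a non-root user cannot write to it unless the kubelet applies group ownership at mount time via a *pod-level* `fsGroup`. A minimal sketch of what the resulting pod spec would need to look like (the pod name, image reference, and claim name here are placeholders, and the 100/101 IDs are assumptions mirroring the values tried below):

```yaml
# Sketch: fsGroup must sit in the pod-level securityContext (spec.securityContext),
# not in a container-level one, for the kubelet to chown/chmod the volume.
apiVersion: v1
kind: Pod
metadata:
  name: consensus-fsgroup-example   # hypothetical name
spec:
  securityContext:
    runAsUser: 100                  # assumed runtime UID of the atomix image
    runAsGroup: 101
    fsGroup: 101                    # kubelet recursively chowns the volume to this GID
    fsGroupChangePolicy: OnRootMismatch
  containers:
  - name: consensus
    image: atomix/consensus-node    # placeholder image reference
    volumeMounts:
    - name: data
      mountPath: /var/lib/atomix
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-onos-consensus-store-0
```

The open question is whether the Helm chart renders anything like this into the generated StatefulSet.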

The pod fails to start because the process does not have write permission to the `/var/lib/atomix` directory:

```
++ hostname
+ [[ onos-consensus-store-0 =~ -([0-9]+)$ ]]
+ ordinal=0
+ atomix-consensus-node --config /etc/atomix/raft.yaml --api-port 5678 --raft-host onos-consensus-store-0.onos-consensus-store-hs.aether-roc.svc.cluster.local --raft-port 5679
2023-03-28 16:45:57.142110 I | dragonboat: go version: go1.19.4, linux/amd64
2023-03-28 16:45:57.142165 I | dragonboat: dragonboat version: 3.3.5 (Rel)
2023-03-28 16:45:57.142201 I | config: using default EngineConfig
2023-03-28 16:45:57.142212 I | config: using default LogDBConfig
2023-03-28 16:45:57.142262 I | dragonboat: DeploymentID set to 1
panic: mkdir /var/lib/atomix/data: permission denied

goroutine 1 [running]:
github.com/atomix/consensus-storage/node/pkg/consensus.NewProtocol({0x0, 0x0, 0x0, 0x0, 0x0}, 0xc00028cfc0, {0xc0001d7cb8, 0x2, 0x0?})
        github.com/atomix/consensus-storage/node/pkg/consensus/protocol.go:57 +0x476
main.main.func1(0xc0002ea000?, {0x12d13eb?, 0x8?, 0x8?})
        github.com/atomix/consensus-storage/node/cmd/atomix-consensus-node/main.go:83 +0x71a
github.com/spf13/cobra.(*Command).execute(0xc0002ea000, {0xc0000400a0, 0x8, 0x8})
        github.com/spf13/cobra@v1.4.0/command.go:860 +0x663
github.com/spf13/cobra.(*Command).ExecuteC(0xc0002ea000)
        github.com/spf13/cobra@v1.4.0/command.go:974 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v1.4.0/command.go:902
main.main()
        github.com/atomix/consensus-storage/node/cmd/atomix-consensus-node/main.go:157 +0x24f
```

I have seen workarounds that suggest manually chowning the volumes outside of Atomix, but I am looking for a solution that requires no manual intervention. I tried setting a `securityContext`, but it does not seem to propagate to the `onos-consensus-store` pods:

```yaml
global:
  atomix:
    storage:
      controller: "atomix-controller.kube-system:5679"
    store:
      consensus:
        enabled: true
        clusters: 1
        replicas: 3
        partitions: 3
        log:
          storageClass: "silver"
          resources:
            requests:
              storage: 25Gi
        securityContext:
          runAsUser: 100
          runAsGroup: 101
          fsGroup: 101
          fsGroupChangePolicy: OnRootMismatch
```
Looking for advice on how to get this working without manual intervention, or on where the appropriate place is to provide the `securityContext` details so the filesystem is chowned before `atomix-consensus-node` needs to write to it.
