
Cannot apply csi driver on cis-1.23 hardened rke2 #517

Open
jhoelzel opened this issue Jul 6, 2023 · 8 comments

Comments

jhoelzel commented Jul 6, 2023

Hey there,
I'm unable to install the CSI driver on a hardened RKE2 cluster that follows the SELinux and CIS guidelines.
My goal is to use the wonderful DO block storage instead of the space on the nodes.
The process seems to stop at the initialization container directly, because there is no way for the container to stat '/etc/udev/rules.d/99-digitalocean-automount.rules', which is most likely due to SELinux.

What did you do? (required. The issue will be closed when not provided.)

```sh
kubectl apply -f https://raw.githubusercontent.com/digitalocean/csi-digitalocean/master/deploy/kubernetes/releases/csi-digitalocean-v4.6.1/{crds.yaml,driver.yaml,snapshot-controller.yaml}
```

What did you expect to happen?

The CSI driver installing itself on a hardened RKE2 cluster that is CIS-compliant.
I would like to operate hardened nodes while also using DO block storage for my dynamic storage needs.

Configuration (MUST fill this out):

I changed nothing but installed the defaults. I of course added the DO token as a secret in the same namespace.

```sh
$ kubectl logs csi-do-node-76f6z -n kube-system
Error from server (BadRequest): container "csi-do-plugin" in pod "csi-do-node-76f6z" is waiting to start: PodInitializing

$ kubectl logs csi-do-node-76f6z -n kube-system -c automount-udev-deleter
rm: can't stat '/etc/udev/rules.d/99-digitalocean-automount.rules': Permission denied
```
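
To confirm that SELinux is what blocks the init container, a minimal sketch run directly on the affected node (assuming the audit tools are installed on the droplet):

```sh
# Show the SELinux label on the udev rule file.
ls -Z /etc/udev/rules.d/99-digitalocean-automount.rules

# Search recent SELinux denials (AVC records) for the blocked rm call.
sudo ausearch -m avc -ts recent | grep 99-digitalocean-automount
```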
  • CSI Version:

v4.6.1

  • Kubernetes Version:

Client Version: v1.25.11+rke2r1
Kustomize Version: v4.5.7
Server Version: v1.25.11+rke2r1

  • Cloud provider/framework version, if applicable (such as Rancher):
    default RKE2
timoreimann (Collaborator) commented

Hi 👋

I also think this is probably connected to SELinux and how it affects the default permission scheme. My experience with SELinux is limited, but a quick Google search yielded https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#assign-selinux-labels-to-a-container, which I suspect might be necessary here.
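
For reference, a minimal sketch of what those docs describe, applied to the init container named in the logs above. The label value is copied from the Kubernetes docs and is purely illustrative; whether any label helps depends entirely on the host's SELinux policy:

```yaml
# Sketch: fragment of the csi-do-node pod spec, adding an SELinux label to the
# init container via securityContext, per the Kubernetes docs linked above.
# The level value is an example only; the right label depends on your policy.
initContainers:
  - name: automount-udev-deleter
    securityContext:
      seLinuxOptions:
        level: "s0:c123,c456"
```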

jhoelzel (Author) commented Jul 6, 2023

Thank you for your quick help! This led me to analyze the driver install further, where I found this comment:

```yaml
# Delete automount udev rule running on all DO droplets. The rule mounts
# devices briefly and may conflict with CSI-managed droplets (leading to
# "resource busy" errors). We can safely delete it in DOKS.
```

Therefore I simply deleted the file in my provisioning base image and reapplied it to the cluster, and everything works. On a hardened cluster, this is definitely not an action a container should be allowed to take. =)

So if anyone finds this with the same problem, do the following, then restart the node pods as sketched below:

  • log into your droplet
  • sudo rm -f /etc/udev/rules.d/99-digitalocean-automount.rules
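
A minimal sketch for the restart, assuming the node DaemonSet pods carry the app=csi-do-node label from the release manifest:

```sh
# Delete the node plugin pods; the DaemonSet recreates them, and the init
# container now succeeds because the udev rule file no longer exists.
kubectl delete pod -n kube-system -l app=csi-do-node
```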

jhoelzel closed this as completed Jul 6, 2023
jhoelzel reopened this Aug 24, 2023
jhoelzel (Author) commented

I'm reopening this as it's not solved.

DigitalOcean will replace this file on every boot, so after every reboot you have to log in and delete it again.

timoreimann (Collaborator) commented

@jhoelzel I wonder if, as a workaround to the host mount, you could inject something through cloud-init / user data that deletes the udev file as well. That way, you don't need to do it via the CSI pod.

Another alternative could be to bake similar logic into a custom droplet / VM base image. Probably not great from a maintenance point of view, though.

This may be best solved at the storage / droplet layer, perhaps through a new API field to suppress the udev rule addition. Feel free to file a support ticket with DO for that, as I think it's not fully in scope of what this repo should be tracking.

jhoelzel (Author) commented

I tried the SELinux policy change already, but the problem is that my filesystem is read-only after cloud-init, and no matter what I do, there seems to be no way to remove the file without using the CLI myself.

On Ubuntu you would simply use my service here:

```ini
[Unit]
Description=Delete DigitalOcean udev rule
After=cloud-final.service
ConditionPathExists=/etc/udev/rules.d/99-digitalocean-automount.rules

[Service]
Type=oneshot
ExecStart=/bin/rm -f /etc/udev/rules.d/99-digitalocean-automount.rules
RemainAfterExit=yes
User=root
Group=root
UMask=0077
ProtectSystem=full
ProtectHome=yes
PrivateTmp=yes
NoNewPrivileges=yes

[Install]
WantedBy=cloud-init.target
```
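
Assuming the unit is saved as /etc/systemd/system/delete-do-udev-rule.service (the file name is illustrative), it is wired up the usual way:

```sh
# Register the new unit and enable it; the oneshot runs after cloud-init
# on every boot thanks to the WantedBy=cloud-init.target install section.
sudo systemctl daemon-reload
sudo systemctl enable --now delete-do-udev-rule.service
```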

IMHO cloud-init does need a flag for this, but I'm pretty sure that's going to take very long to implement on the DO side, and I will have moved the 100 nodes to a different hoster by then :/

jhoelzel (Author) commented Aug 24, 2023

So I used the hammer:

```sh
sudo rm -f -r /etc/udev/rules.d
```

This makes the script's conditional skip your system, as it thinks you don't have a rules.d directory.

The obvious downside is that all other rules need to be added by hand, but this will solve your troubles.

For my MicroOS friends:
put it in a transactional update.

I'm rebuilding the cluster now, but I would still consider this a bug on DO's side.

Reference in the DO cloud-init:

```sh
if [ -d /etc/udev/rules.d ]; then
    # Add udev rules to automount block storage volumes.
```
DO code with the issue:
https://github.com/Shdoyle/COSC419RealBackup/blob/master/lib/cloud/instances/110825973/vendor-data.txt

timoreimann (Collaborator) commented

@jhoelzel what I meant was that when you create the droplets, you could extend the user data portion with a script that deletes the udev file. I think the droplet API offers a way to interact with cloud-init in this fashion.

I believe this would also work independently of any fix DO itself might drive.
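
For example, a minimal sketch with doctl; the droplet name, region, size, and image are placeholders, and remove-udev-rule.yaml would hold a cloud-config like the one in the next comment:

```sh
# Hypothetical: pass a cloud-config file as user data when creating the droplet.
doctl compute droplet create my-rke2-node \
  --region fra1 \
  --size s-4vcpu-8gb \
  --image ubuntu-22-04-x64 \
  --user-data-file ./remove-udev-rule.yaml
```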

jhoelzel (Author) commented

So deleting the directory did not work, as other services will recreate it, and you were right all along.

The only real way to resolve it is cloud-init in combination with bootcmd:

```hcl
user_data = <<-EOT
  #cloud-config

  bootcmd:
    - [ rm, -f, /etc/udev/rules.d/99-digitalocean-automount.rules ]
EOT
```
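
For anyone copying this: bootcmd runs early on every boot, unlike runcmd, which runs only on an instance's first boot, which is presumably why this variant survives reboots where the one-shot deletions did not. After a reboot, /etc/udev/rules.d should no longer contain the 99-digitalocean-automount.rules file.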
