'sudo: simple-file-writer: command not found' when resizing #390
Tried rolling the image tag back from v1.9.0 to v1.8.4, and that resulted in a different error.
Not sure what is wrong here, but any advice would be helpful. I have workloads that cannot schedule until the PVC is done resizing, and a postgres instance has been down for 24 hours now; I cannot figure out how to get the resize to finish. The volumes have already had quota changes applied on the TrueNAS side.
Actually, after chasing this down for a bit, I believe this is related to #295. Related, but not the same - as I do not use
Here is my full iSCSI config:
controller:
  externalAttacher:
    resources:
      limits:
        cpu: 50m
        memory: 50Mi
      requests:
        cpu: 50m
        memory: 50Mi
  externalProvisioner:
    resources:
      limits:
        cpu: 50m
        memory: 50Mi
      requests:
        cpu: 50m
        memory: 50Mi
  externalSnapshotter:
    resources:
      limits:
        cpu: 50m
        memory: 30Mi
      requests:
        cpu: 50m
        memory: 30Mi
  externalResizer:
    resources:
      limits:
        cpu: 50m
        memory: 50Mi
      requests:
        cpu: 50m
        memory: 50Mi
  driver:
    image: ghcr.io/democratic-csi/democratic-csi:v1.9.0
    resources:
      limits:
        cpu: 200m
        memory: 200Mi
      requests:
        cpu: 200m
        memory: 200Mi
node:
  driver:
    image: ghcr.io/democratic-csi/democratic-csi:v1.9.0
    resources:
      limits:
        cpu: 200m
        memory: 128Mi
      requests:
        cpu: 200m
        memory: 128Mi
csiDriver:
  # should be globally unique for a given cluster
  name: "org.democratic-csi.iscsi"
# add note here about volume expansion requirements
storageClasses:
  - name: freenas-iscsi-csi
    defaultClass: true
    reclaimPolicy: Delete
    volumeBindingMode: Immediate
    allowVolumeExpansion: true
    parameters:
      # for block-based storage can be ext3, ext4, xfs
      # for nfs should be nfs
      fsType: ext4
      # if true, volumes created from other snapshots will be
      # zfs send/received instead of zfs cloned
      # detachedVolumesFromSnapshots: "false"
      # if true, volumes created from other volumes will be
      # zfs send/received instead of zfs cloned
      # detachedVolumesFromVolumes: "false"
    mountOptions: []
    secrets:
      provisioner-secret:
      controller-publish-secret:
      node-stage-secret:
      # # any arbitrary iscsiadm entries can be added by creating keys starting with node-db.<entry.name>
      # # if doing CHAP
      # node-db.node.session.auth.authmethod: CHAP
      # node-db.node.session.auth.username: foo
      # node-db.node.session.auth.password: bar
      #
      # # if doing mutual CHAP
      # node-db.node.session.auth.username_in: baz
      # node-db.node.session.auth.password_in: bar
      node-publish-secret:
      controller-expand-secret:
# if your cluster supports snapshots you may enable below
volumeSnapshotClasses: []
#- name: freenas-iscsi-csi
#  parameters:
#  # if true, snapshots will be created with zfs send/receive
#  # detachedSnapshots: "false"
#  secrets:
#    snapshotter-secret:
driver:
  config:
    # please see the most up-to-date example of the corresponding config here:
    # https://github.com/democratic-csi/democratic-csi/tree/master/examples
    # YOU MUST COPY THE DATA HERE INLINE!
    driver: ${driver_name}
    instance_id:
    httpConnection:
      protocol: https
      host: ${hostname}
      port: 8443
      apiKey: ${api_key}
      allowInsecure: false
      apiVersion: 2
    sshConnection:
      host: ${hostname}
      port: 22
      username: ${username}
      privateKey: |
        ${indent(8, ssh_priv_key)}
    zfs:
      # the example below is useful for TrueNAS 12
      cli:
        sudoEnabled: true
        paths:
          zfs: /sbin/zfs
          zpool: /sbin/zpool
          sudo: /bin/sudo
          chroot: /sbin/chroot
      # total volume name (zvol/<datasetParentName>/<pvc name>) length cannot exceed 63 chars
      # https://www.ixsystems.com/documentation/freenas/11.2-U5/storage.html#zfs-zvol-config-opts-tab
      # standard volume naming overhead is 46 chars
      # datasetParentName should therefore be 17 chars or less
      datasetParentName: ssd/block/v
      detachedSnapshotsDatasetParentName: ssd/block/s
      datasetEnableQuotas: true
      datasetEnableReservation: false
      # "" (inherit), lz4, gzip-9, etc
      zvolCompression:
      # "" (inherit), on, off, verify
      zvolDedup:
      zvolEnableReservation: false
      # 512, 1K, 2K, 4K, 8K, 16K, 64K, 128K default is 16K
      zvolBlocksize:
    iscsi:
      targetPortal: "${hostname}:${portal_port}"
      targetPortals: []
      # leave empty to omit usage of -I with iscsiadm
      interface:
      namePrefix: csi-
      nameSuffix: "-cluster"
      # add as many as needed
      targetGroups:
        # get the correct ID from the "portal" section in the UI
        - targetGroupPortalGroup: 1
          # get the correct ID from the "initiators" section in the UI
          targetGroupInitiatorGroup: 1
          # None, CHAP, or CHAP Mutual
          targetGroupAuthType: None
          # get the correct ID from the "Authorized Access" section of the UI
          # only required if using Chap
          targetGroupAuthGroup:
      extentInsecureTpc: true
      extentXenCompat: false
      extentDisablePhysicalBlocksize: true
      # 512, 1024, 2048, or 4096,
      extentBlocksize: 4096
      # "" (let FreeNAS decide, currently defaults to SSD), Unknown, SSD, 5400, 7200, 10000, 15000
      extentRpm: "SSD"
      # 0-100 (0 == ignore)
      extentAvailThreshold: 0
I've rolled forward to @travisghansen if you have any ideas on how I can get the resize unstuck for now, that would be appreciated.
Well, the command exists in the image and is in the path, so I'm confused why the CSI cannot find it during execution...
-> % docker run -it --entrypoint bash ghcr.io/democratic-csi/democratic-csi:v1.9.0
root@9ceb6701b729:/home/csi/app# ls
bin csi_proto csi_proxy_proto LICENSE node_modules package.json package-lock.json src
root@9ceb6701b729:/home/csi/app# find / -type f -name "simple-file-writer"
/usr/local/bin/simple-file-writer
root@9ceb6701b729:/home/csi/app# echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
root@9ceb6701b729:/home/csi/app# simple-file-writer
/usr/local/bin/simple-file-writer: line 3: ${2}: ambiguous redirect
root@9ceb6701b729:/home/csi/app# cat /usr/local/bin/simple-file-writer
#!/bin/bash
echo ${1} > ${2}
root@9ceb6701b729:/home/csi/app#
I have a feeling this is related: democratic-csi/src/driver/freenas/ssh.js, line 2057 in a6dec24
Ok, it took a few iterations of building an image variant to confirm, since I can't build the container due to not being able to download objectivefs, but:
{"host":"truenas-iscsi-democratic-csi-controller-5f4b6cfd5d-vjsrk","level":"error","message":"handler error - driver: FreeNASSshDriver method: ControllerExpandVolume error: {\"name\":\"GrpcError\",\"code\":2,\"message\":\"error reloading iscsi daemon: {\\\"stderr\\\":\\\"sudo: /usr/local/bin/simple-file-writer: command not found\\\\n\\\",\\\"code\\\":1}\"}","service":"democratic-csi","timestamp":"2024-04-26T21:36:25.603Z"}
It's not a path problem - same error with the full path. I think this is being executed on the FreeNAS host? Another issue I hit, while looking at this: My cert expired on the TrueNAS and I started getting
So the error message I posted showing the full path looks weird because, to find out the cert expired, I had to modify this:
throw new GrpcError(
  grpc.status.FAILED_PRECONDITION,
  `TrueNAS api is unavailable: ${err.getMessage()}`
);
to
throw new GrpcError(
  grpc.status.FAILED_PRECONDITION,
  `TrueNAS api is unavailable: ${err}`
);
That allowed me to see the cert error - seems like whatever throws the cert error isn't building the error object the same way.
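The mismatch described above could be handled defensively. Here is a minimal sketch, assuming the caught value may be either an object exposing `getMessage()` or a plain Node.js `Error`. The helper name `describeError` is hypothetical and not part of democratic-csi:

```javascript
// Hypothetical helper: extract a readable message from whatever was thrown.
function describeError(err) {
  // Objects that implement getMessage() keep working as before.
  if (err && typeof err.getMessage === "function") {
    return err.getMessage();
  }
  // Plain Error instances (assumed here for the cert failure) only have .message.
  if (err instanceof Error) {
    return err.message;
  }
  // Anything else (strings, null, ...) is stringified as a last resort.
  return String(err);
}

const certError = new Error("certificate has expired");
console.log(describeError(certError));

const apiError = { getMessage: () => "HTTP 500" };
console.log(describeError(apiError));
```

With something like this, the template string could call `describeError(err)` and would not need to know which kind of error object it received.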
After playing around in the TrueNAS shell, I noticed the TODO and realized that neither the if statement causing this nor the shell script is needed if you double-quote the echo statement for the sudo call.
csi@truenas01:~$ sudo echo 1 > /sys/kernel/scst_tgt/devices/csi-pvc-bf920f1b-9270-437c-9193-8724cf1eee24-cluster/resync_size
-bash: /sys/kernel/scst_tgt/devices/csi-pvc-bf920f1b-9270-437c-9193-8724cf1eee24-cluster/resync_size: Permission denied
csi@truenas01:~$ sudo sh -c echo 1 > /sys/kernel/scst_tgt/devices/csi-pvc-bf920f1b-9270-437c-9193-8724cf1eee24-cluster/resync_size
-bash: /sys/kernel/scst_tgt/devices/csi-pvc-bf920f1b-9270-437c-9193-8724cf1eee24-cluster/resync_size: Permission denied
csi@truenas01:~$ sudo sh -c "echo 1 > /sys/kernel/scst_tgt/devices/csi-pvc-bf920f1b-9270-437c-9193-8724cf1eee24-cluster/resync_size"
So I modified your original code from
if (process.env.DEMOCRATIC_CSI_IS_CONTAINER == "true") {
  // use the built-in wrapper script that works with sudo
  command = execClient.buildCommand("simple-file-writer", [
    "1",
    `/sys/kernel/scst_tgt/devices/${kName}/resync_size`,
  ]);
} else {
  // TODO: syntax fails with sudo
  command = execClient.buildCommand("sh", [
    "-c",
    `echo 1 > /sys/kernel/scst_tgt/devices/${kName}/resync_size`,
  ]);
}
to
command = execClient.buildCommand("sh", [
  "-c",
  `"echo 1 > /sys/kernel/scst_tgt/devices/${kName}/resync_size"`,
]);
And that has resolved it for me.
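Why the inner quotes matter: the command ends up being parsed by a shell on the TrueNAS host, so the redirect has to arrive as part of the single argument given to `sh -c`. Below is a rough sketch, assuming the command string is assembled by joining the arguments with spaces and prefixing `sudo`. The `joinCommand` helper is a hypothetical stand-in, not the real `execClient.buildCommand`, and the device name `example` is a placeholder:

```javascript
// Hypothetical stand-in for a builder that joins arguments with spaces.
function joinCommand(command, args, useSudo = true) {
  const parts = useSudo ? ["sudo", command, ...args] : [command, ...args];
  return parts.join(" ");
}

// Placeholder device path, for illustration only.
const target = "/sys/kernel/scst_tgt/devices/example/resync_size";

// Without inner quotes, the receiving login shell handles the redirect itself
// and opens the file as the unprivileged user before sudo ever runs.
const unquoted = joinCommand("sh", ["-c", `echo 1 > ${target}`]);

// With inner quotes, the whole string reaches the sh started by sudo as one
// argument, so the redirect is performed with elevated privileges.
const quoted = joinCommand("sh", ["-c", `"echo 1 > ${target}"`]);

console.log(unquoted);
console.log(quoted);
```

The two printed strings have the same shape as the second and third commands tried by hand in the TrueNAS shell, which is why only the quoted variant avoids the permission error.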
I'm currently running off my patched version. I can throw up a PR; however, I'm uncertain how to properly test the build, since the objectivefs part of the build blocks me from building.
You are entirely correct about running on the TN machine. Not sure what I was thinking. Is the code you have working with sudo?
Yes
I’ll get this incorporated shortly. Thanks for taking the time to sort it out! Good point about objectivefs too, I think I’ll make that more friendly to these kinds of scenarios.
Yeah, if you can fix that, mate, then I can likely start throwing PRs your way when I hit stuff like this. I have enough JS and Node skills to be of use, and had to hydrate on the codebase a wee bit yesterday.
I have similar issues but my errors are different, is this the same issue or should I create a new one? Kubernetes event:
Dmesg:
I am coincidentally also blocked on a postgres pod. @Routhinator I have an error running your image:
That is a different error and not related. And the exec error you get with my image suggests you are not running it on x86 infrastructure. I did not build arm64 images.
Ah I see, it is indeed only my arm node that has the issue.
That syntax was reported to not work previously. Need to figure out what the deal is here.
Oh, I see yours has quotes. Never mind :)
Should be fixed here: 38bee21 Give
This error seems new. It's now coming up when resizing a volume on the new TrueNAS Scale Dragonfish release. Is this executable supposed to be on the TrueNAS side?
Driver
freenas-iscsi-csi
- Democratic CSI Chart 0.14.6
- TrueNAS Scale Dragonfish Train - Version TrueNAS-SCALE-24.04.0