storage_io_error while deploying cluster #75
Comments
Do you mean it works sometimes and sometimes not? |
Do you know what this could be @glommer ? |
The first node of a 3-node Scylla cluster will spin up and reach a Ready state,
but will continue to spam storage_io_error messages.
While writing this reply and looking at the simple-cluster-statefulset-1 logs to post here, it seems that it is now struggling with malformed sstables rather than the storage_io_error from my previous run.
Interestingly, on my secondary testing cluster, which has the same setup (Rook-Ceph as storage backend, Kubernetes 1.18.0, Ubuntu 18.04.4 LTS), just smaller, everything works fine. |
Can you test with developerMode: false? |
So, running with developerMode: false in the cluster.yaml, I get the following logs:
|
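For reference, a minimal sketch of where this flag lives; apart from developerMode itself, the field names and values below are assumed from the operator's example cluster.yaml rather than taken from this thread:

```yaml
# Hypothetical cluster.yaml excerpt -- only developerMode is the setting
# under discussion here; the surrounding fields and values are illustrative.
apiVersion: scylla.scylladb.com/v1alpha1
kind: Cluster
metadata:
  name: simple-cluster
  namespace: scylla
spec:
  version: 4.0.rc1
  developerMode: false   # the flag being toggled in the comment above
  datacenter:
    name: dc1
    racks:
      - name: rack1
        members: 3
        storage:
          capacity: 100Gi
```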
That looks completely broken now if the commitlog is not found. Can you recreate the cluster from scratch? |
Can you check the logs of the rook agent? There is an issue that could perhaps explain it, but the logs need to be checked to be sure. |
Off the top of my head I don't know what in the commitlog could return EINTR. Maybe your fellow Swede Calle knows? However, just as a hunch, I'll say that maybe this is because the storage is not on XFS. I don't know if it is or it isn't, but my bet is that it isn't, and this is some fs-specific behavior. We very seldom test outside of XFS. |
My rook ceph deployment doesn't make use of agents. As for whether I can recreate the cluster, that would be the very last thing I'd try once all my other options are exhausted.
I set up rook-ceph as our storage backend to deploy the OSDs on drives or partitions formatted as XFS.
I'm also investigating whether the root cause of my issue might be a wonky rook-ceph cluster after my Kubernetes upgrades. I initially dismissed that thought because other stores like MinIO and Elasticsearch, which use it, are working fine, but Scylla might be more sensitive to issues with rook-ceph. |
So, after tearing down and recreating a stable rook-ceph cluster, I redeployed the latest scylla-operator.
I only get the storage_io_error when using scylla-operator version v0.1.2.
|
@MicDeDuiwel I was having a similar issue starting a Scylla cluster on AKS. I switched to using Scylla 3.3.0 and was able to successfully deploy a Scylla cluster. Not sure if that is an option for you, but it might be worth trying. |
That is very strange. Can you reproduce this at will? Does it always fail on 3.2.1, @cmball1? |
@cmball1 Thanks, I now get a different error, which leads me to believe that the issue I'm facing might be the way Scylla interprets the Ceph filesystem:
|
Scylla 3.2 Docker images may hit a known issue, scylladb/scylladb#5638, that is only fixed in the Scylla 3.3 Docker images (where we work around an issue of not having enough resources set). |
Update on my progress.
I set my rook cluster CRD to fstype: xfs according to this link: https://kubernetes.io/docs/concepts/storage/storage-classes/#ceph-rbd. I consistently manage to get one or two nodes running when using the latest 4.0.rc1; any more nodes are a struggle. I did find that when redeploying the cluster, it is best to redeploy the operator as well.
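For illustration, a StorageClass along the lines of the linked ceph-rbd docs; the names and parameter values below are placeholders rather than values from this deployment:

```yaml
# Hypothetical in-tree Ceph RBD StorageClass requesting XFS, per the
# linked docs; monitor address and pool are placeholders, and the
# auth/secret parameters are omitted for brevity.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd-xfs
provisioner: kubernetes.io/rbd
parameters:
  monitors: 10.16.153.105:6789
  pool: rbd
  fsType: xfs   # the setting under discussion
```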
This might be a 4.0.rc1 thing, but the nodes will not deploy unless they are in developer mode. Logs from the final node of a 3-node cluster:
The operator repeats these logs:
It is also interesting to note that I can't scale up the Scylla cluster after deployment as indicated by the docs:
Also, even on the running nodes, I can't seem to access cqlsh:
Finally, before the final node reboots itself, it throws a bunch of errors regarding malformed sstables:
|
@MicDeDuiwel it looks like Ceph has a bad interaction with asynchronous I/O. @tchaikov, do you happen to know of any problems with aio/dio and Ceph? We see EINTR and maybe data corruption. |
Not sure if relevant, but Ceph seems to have issues with XFS support on top of RBDs: |
It does look relevant. |
@yanniszark @avikivity it looks like an issue on the nbd (rbd) side, so it's a bit outside my expertise; I am more focused on the underlying RADOS cluster, and rbd is a client of the RADOS cluster. The symptom described in rook/rook#3132 was high CPU usage caused by a deadlock in the kernel where the rbd client is colocated with an OSD daemon, but I cannot find evidence of either of those in the description above. Is Scylla colocated with the OSD? Is high CPU usage observed? Also, probably a little bit off-topic: should Scylla (or Seastar) retry the syscall when it returns EINTR? |
Good idea re EINTR. Most filesystems don't return it, but NFS soft mounts can, and we see that so does Ceph. But how much should we retry? At some point we should give up. Is there documentation somewhere on when we can expect EINTR? I don't want to retry EINTR if I get it as a response to a signal. |
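For readers following along, the classic userspace idiom is the bare retry loop sketched below (illustrative C++, not Seastar's actual I/O path); note that it exhibits exactly the tension raised above, since it also swallows interruptions from deliberate signals:

```cpp
#include <cerrno>
#include <unistd.h>

// Illustrative only: retry a read(2) that failed with EINTR. Most local
// filesystems never return EINTR here, but network-backed storage (NFS
// soft mounts, and apparently Ceph RBD in this thread) can. A loop like
// this cannot tell a harmless interruption apart from a signal the
// application actually wanted to react to.
ssize_t read_retrying_eintr(int fd, void* buf, size_t count) {
    ssize_t r;
    do {
        r = ::read(fd, buf, count);
    } while (r == -1 && errno == EINTR);
    return r;
}
```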
@dahankzter Yes, I tried deploying a cluster several times, and each time I hit the same issue with the storage_io_error. |
An update from our side: after a lot of digging and reaching out, we came to the realisation that our rook deployment must be at fault. We had many issues with rook and CephFS and decided to switch to Ceph block storage. This solved our issue; Scylla deployments are working just fine now.
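For anyone making the same switch, a replicated block pool along these lines is the usual starting point (names and values below are illustrative, not taken from this deployment); note that the replicated size here is exactly the layer that multiplies Scylla's own replication, as pointed out in the next comment:

```yaml
# Hypothetical rook-ceph block pool backing an RBD StorageClass; with
# size: 3 here and RF=3 in Scylla, each piece of data is stored nine times.
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
```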
Not sure. I need to consult rbd developers on this. |
@dahankzter can we prevent, or at least warn about, this kind of deployment? At least until we can figure out how we can support it. @MicDeDuiwel note that you'll be experiencing double replication and poor performance running on top of Ceph, since each of Scylla's replicas will itself be replicated by Ceph. So if each layer has a replication factor of 3, you end up with an overall replication factor of 9. The recommendation is to work with local volumes. |
Afaik it's very hard to detect this, @avikivity. We could perhaps try to detect the presence of essential directories, but that doesn't really help since it's pretty late in the cycle: deployment is well under way by the time this is possible, and it's already done by Scylla, no? |
@dahankzter I opened this issue yesterday and it looks somewhat similar. I have a reproducer: |
@MicDeDuiwel did you have to enable developer mode for Scylla to work on top of Ceph block storage? I'm trying to do the same thing, but io_setup always fails. |
@akhilles yes, we deployed the scylla-operator with developer mode enabled. |
Describe the bug
Hi, I'm having issues getting my Scylla cluster to run on my Kubernetes 1.18.0 cluster using rook-ceph as the storage backend.
I keep getting this error when the first node starts up:
Or sometimes the first node starts up fine and the second node throws the same error.
I set up some nodeSelectors to deploy the Scylla cluster on nodes that I'm certain don't have disk issues.
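A sketch of such a placement, assuming the rack-level placement/nodeAffinity fields from the operator's example manifests; the node label is a placeholder for whatever marks the healthy nodes:

```yaml
# Hypothetical rack excerpt pinning Scylla pods to known-good nodes.
racks:
  - name: rack1
    members: 3
    placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: storage-health   # placeholder label
                  operator: In
                  values: ["good"]
```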
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expect the scylla pods to deploy one by one.
Logs
Environment:
Additional context
I had updated the cluster from 1.15 -> 1.16 -> 1.17 -> 1.18 when this first started happening, but on a second test cluster, on which I performed the same Kubernetes upgrade procedure, Scylla was working fine.