Segmentation fault when emitting trace during a snapshot operation (7.3.27) #11308
Comments
I am not familiar with this feature. However, the log shows "Roles: CS,DD,MS,RK", which are all stateless roles and shouldn't create disk snapshot data. "CS" is the new ConsistencyScan role, so the snapshot code may have erroneously failed to exclude the "CS" role.
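To make the hypothesis concrete, here is a minimal sketch of the kind of check involved. It is not FoundationDB's snapshot code: the function names, the worker map, and the "SS" role used in the usage note are assumptions for illustration. Only the four role codes in `STATELESS_ROLES` come from the log line quoted above.

```python
# Illustrative only: this is NOT FoundationDB's implementation.
# The four role codes below are the ones quoted from the log and
# described as stateless in this thread.
STATELESS_ROLES = frozenset({"CS", "DD", "MS", "RK"})


def parse_roles(roles_field: str) -> set[str]:
    """Split a comma-separated Roles value such as "CS,DD,MS,RK"."""
    return {role.strip() for role in roles_field.split(",") if role.strip()}


def is_stateless(roles_field: str) -> bool:
    """True when every listed role is in the known-stateless set.

    Any role outside that set is treated as potentially stateful.
    """
    return parse_roles(roles_field) <= STATELESS_ROLES


def snapshot_targets(workers: dict[str, str]) -> list[str]:
    """Return the addresses of workers that should receive a snapshot request.

    `workers` is a hypothetical map of process address -> Roles string.
    """
    return sorted(addr for addr, roles in workers.items() if not is_stateless(roles))
```

With a map such as `{"10.0.0.1:4500": "CS,DD,MS,RK", "10.0.0.2:4500": "SS"}` (where "SS" stands in for some stateful role), only the second address is returned. If "CS" were missing from the stateless set, the first process would be targeted as well, which is the failure mode suggested above.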
Thanks @jzhou77, it was indeed a stateless process affected. Is there a way to get a better Also: is this CS role something I can disable, to verify the hypothesis? |
Another idea: perhaps we could make the |
* Cherry pick #11308 Raise visibility of gray failure actions * format change --------- Co-authored-by: Dan Lambright <hlambright@apple.com>
I have been looking into this using logs; I suspect that this commit 29f98f3 might be involved in the problem. The crash seems to be in an unspecified code line while trying to create requests for each of the stateful workers. @sfc-gh-clin do you have any insight into this? |
Yeah, I just looked at |
Thanks @sfc-gh-clin! Any plans to backport this fix to a stable release? I am going to cherry-pick the commit until then.
No problem. Do you mind doing it yourself, or maybe @jzhou77 can do it?
NP. I created #11341 to fix this.
Just noticed the follow-up, thanks! I am unfamiliar with the release process: will this at some point land in a new 7.3.x tagged release, marked as pre-release? In that case I think I will stick to 7.3.27 + cherry-pick to avoid potential issues.
Yes. This fix will be included in the next 7.3 release (marked as pre-release), so I will close this issue for now.
When using the snapshot command, I can reliably reproduce a crash with FoundationDB 7.3.27. This crash cannot be reproduced with 7.1.15.
The corresponding server-side trace:
Symbolizing leads to this output (I had to Ctrl+C because it was hanging):