Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distinguish criu dump fail from criu dump crash in zdtm #350

Open
xemul opened this issue Jun 30, 2017 · 7 comments · May be fixed by #2376
Open

Distinguish criu dump fail from criu dump crash in zdtm #350

xemul opened this issue Jun 30, 2017 · 7 comments · May be fixed by #2376
Labels

Comments

@xemul
Copy link
Member

xemul commented Jun 30, 2017

In zdtm .desc files there can be a flag called "crfail". It means, that the dump of this test should fail, we know it and are willing to fix. Unfortunately, the zdtm.py script doesn't tell criu dump command failure from crash (for sigsegv, for example). While the former result is what's expected, the latter one should be reported as test failure.

@github-actions
Copy link

A friendly reminder that this issue had no activity for 30 days.

@github-actions
Copy link

github-actions bot commented Apr 3, 2021

A friendly reminder that this issue had no activity for 30 days.

@github-actions github-actions bot closed this as completed Apr 3, 2022
@mihalicyn mihalicyn added no-auto-close Don't auto-close as a stale issue and removed stale-issue labels Mar 13, 2023
@mihalicyn mihalicyn reopened this Mar 13, 2023
@sankalp-12
Copy link
Contributor

sankalp-12 commented Mar 17, 2023

I would like to take up this issue. Is the fix just adding a 'crfail' flag to every ZDTM ,desc file?

@mihalicyn
Copy link
Member

I would like to take up this issue. Is the fix just adding a 'crfail' flag to every ZDTM ,desc file?

nope. It's about distinguishing two cases when criu dump process exits with an error and when it crashes.
When crfail flag is set on test, and criu dump exits with an error that's correct (test is passed), but if criu dump crashes, currently, zdtm will report that test is passed, but in fact is should report a test failure.

@sankalp-12
Copy link
Contributor

In the do_run_tests() function in the zdtm.py script file, when the dump action fails but the crfail flag is set, then the test stops. Adding a try and except block in the cr(cr_api, test, opts) (CRIU API) function call to handle signal errors is where I'm currently at.

 if opts['norst']:
            try_run_hook(test, ["--pre-dump"])
            try:
                cr_api.dump("dump", opts=["--leave-running"])
            except subprocess.CalledProcessError as e:
                if e.returncode < 0:
                    raise test_fail_exc("'criu dump' crashed with signal %d" % -e.returncode)
                else:
                    raise test_fail_exc("'criu dump' failed with return code %d" % e.returncode)
        else:
            try_run_hook(test, ["--pre-dump"])
            try:
                cr_api.dump("dump")
            except subprocess.CalledProcessError as e:
                if e.returncode < 0:
                    raise test_fail_exc("'criu dump' crashed with signal %d" % -e.returncode)
                else:
                    raise test_fail_exc("'criu dump' failed with return code %d" % e.returncode)

Should this work?

@mihalicyn
Copy link
Member

@sankalp-12

you can easily test it by adding bug in CRIU code and run some crfail test ;-)

AFAIU you need to pay attention to this piece of code:

            rst_succeeded = os.access(
                os.path.join(__ddir, "restore-succeeded"), os.F_OK)
            if self.__test.blocking() or (self.__sat and action == 'restore' and
                                          rst_succeeded):
                raise test_fail_expected_exc(action) # <--- here we "allow" criu fail if crfail is set. You need to modify this condition a little bit to distinguish the case of criu crash and criu fail.
            else:
                raise test_fail_exc("CRIU %s" % action)

hint: https://docs.python.org/3/library/subprocess.html#subprocess.Popen.returncode

@chirag-ghosh
Copy link

Is this issue still up for grabs? I would like to take it up.

bsach64 added a commit to bsach64/criu that referenced this issue Mar 28, 2024
Adds a exit_signal static method to criu_cli, criu_config and criu_rpc
used to detect a crash.

Fixes: checkpoint-restore#350

Signed-off-by: Bhavik Sachdev <b.sachdev1904@gmail.com>
@bsach64 bsach64 linked a pull request Mar 28, 2024 that will close this issue
bsach64 added a commit to bsach64/criu that referenced this issue Mar 29, 2024
Adds a exit_signal static method to criu_cli, criu_config and criu_rpc
used to detect a crash.

Fixes: checkpoint-restore#350

Signed-off-by: Bhavik Sachdev <b.sachdev1904@gmail.com>
bsach64 added a commit to bsach64/criu that referenced this issue Apr 17, 2024
Adds a exit_signal static method to criu_cli, criu_config and criu_rpc
used to detect a crash.

Fixes: checkpoint-restore#350

Signed-off-by: Bhavik Sachdev <b.sachdev1904@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants