Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rerun fails with coverage and "dependent" subtests #274

Open
neiser opened this issue Aug 26, 2022 · 12 comments · May be fixed by #387
Open

Rerun fails with coverage and "dependent" subtests #274

neiser opened this issue Aug 26, 2022 · 12 comments · May be fixed by #387
Labels
enhancement New feature or request

Comments

@neiser
Copy link

neiser commented Aug 26, 2022

I've started to use the --rerun-fails option and noticed that when coverage profiling is enabled, the produced coverage.out file only contains the coverage from the failed but re-executed sub-tests.

For example:

$ gotestsum --rerun-fails --packages=. -- -coverprofile=coverage.out -coverpkg="." && go tool cover -func coverage.out
✖  . (2ms) (coverage: 76.9% of statements in .)

DONE 4 tests, 2 failures in 0.271s

✓  . (2ms) (coverage: 46.2% of statements in .)

=== Failed
=== FAIL: . Test_sayHello/Voldemort_makes_it_fail_every_second_time (0.00s)
    main_test.go:16: 
                Error Trace:    /home/agr/oss/gotestsum-rerun-dependent-subtests/main_test.go:16
                Error:          Received unexpected error:
                                afraid of Voldemort, cannot say hello
                Test:           Test_sayHello/Voldemort_makes_it_fail_every_second_time
    --- FAIL: Test_sayHello/Voldemort_makes_it_fail_every_second_time (0.00s)

=== FAIL: . Test_sayHello (0.00s)

DONE 2 runs, 6 tests, 2 failures in 0.509s
github.com/neiser/gotestsum-rerun-dependent-subtests/main.go:11:        sayHello        46.2%
total:                                                                  (statements)    46.2%

The above example usage is from this little demo project to illustrate my issue, see run-test.sh script there.

I've tried to catch the coverage.out from the first run using with --post-run-command="bash -c 'mv coverage.out $(mktemp coverage.out.XXXXXXXXXX)'", but that only produced one file. I assume that's because the post-run command is only executed after all re-runs are done by gotestsum.

Do you have a suggestion how to calculate coverage correctly and/or generate a "merged" coverage.out file from all re-runs?

Futhermore, I'd like to see a command line option such as --rerun-all-subtests which re-runs not only one specific failed sub-test, but essentially all tests of the test case containing the failed sub-test (i.e. find the root of a failed subtest separated by / and run that one again). That would help when a sophisticated integration test is split up into sub-tests and subsequent sub-tests need all previous sub-tests to be run to be successful. I can also prepare an illustration to this problem, but I hope you get the idea. What do you think?

@dnephin
Copy link
Member

dnephin commented Aug 28, 2022

Thanks for opening this issue!

I'd like to see a command line option such as --rerun-all-subtests which re-runs not only one specific failed sub-test, but essentially all tests of the test case containing the failed sub-test

There is a hidden flag --rerun-fails-only-root-testcases which should do what you are describing. It's been a while so I don't remember exactly why I hid it. It might require a small change to filter out sub-tests, but I think it would be a good idea to fix that flag to have it work the way you describe.

Do you have a suggestion how to calculate coverage correctly and/or generate a "merged" coverage.out file from all re-runs?

That's an interesting problem. I think you'd have to solve this by using --raw-command to execute a script that concatenates the files after each run of go test. There's an example of using --raw-command with profiling here: https://github.com/gotestyourself/gotestsum/#custom-go-test-command. I think you could do something similar for coverage.

# test-with-coverage.sh
#!/usr/bin/env bash
set -eu

go test -json -coverprofile=tmp-coverage.out -coverpkg=. "$@"
cat tmp-coverage.out >> coverage.out

That should accept the extra args from gotestsum --rerun-fails (as documented in the first point here), save the coverage to a temp file, then append to the overall coverage file.

That can be used like so:

gotestsum --rerun-fails --packages=.  --raw-command -- ./test-with-coverage.sh

I haven't tried it myself, but something like this should work.

@dnephin
Copy link
Member

dnephin commented Aug 28, 2022

I opened #275 which should address the first request. I renamed the flag to --rerun-fails-run-root-test. Could you try it out to see if it works the way you expect?

@neiser
Copy link
Author

neiser commented Aug 30, 2022

@dnephin Thank you very much for you detailed answer and the PR. I highly appreciate it!

I've commented in #275 already that it solves the part concerning dependent subtests. Concerning the coverage, your idea of concating the the coverage.out did not quite work (the go tool cover did not want to parse that file, complaining about duplice mode: set lines), but using a little extra tool called gocovmerge I was able to get it running, see demo here.

Still, the above solution with a --raw-command is a bit tedious, in particalur because you need to provide an extra script to do the work. Do you think some support for merging coverages in case of reruns could be added to gotestsum?

@dnephin
Copy link
Member

dnephin commented Sep 3, 2022

I guess something like this might also work to remove those extra lines?

go test -json -coverprofile=tmp-coverage.out -coverpkg=. "$@"
grep -v 'mode: set' tmp-coverage.out >> coverage.out

Although maybe you'd then have to re-add the line once as well.

Do you think some support for merging coverages in case of reruns could be added to gotestsum?

I am not sure. Coverage is not something I use much myself, so I'm unlikely to add it. I'm not entirely sure how gotestsum would integrate with test files. I don't want to mirror go test flags and give them different behaviour, because I expect that to be confusing for users.

If you have some idea of how this would look (ex: what command line flags need to be added, what values would they accept, and how would it work at a high level) I'd be happy to review the design and try and work with you to find a clean solution.

@neiser
Copy link
Author

neiser commented Sep 4, 2022

Looking at the implementation of gocovmerge I think it's a bit more involved than simply fiddling with grep to merge the different coverage.out files. That's why I've just used that tool.

The only way I could think of a meaningful integration would be that gotestsum detects the parameter -coverprofile and then replaces that with a temporary file if --rerun is activated. Once all reruns are done, all temporary files are merged into the original file given by the -coverprofile argument. I'd simply copy over gocovmerge.go for merging as it's BSD licensed.

I could try creating a PR for that, I've looked into rerunfails.go and it looks feasible to integrate that logic IMHO. It also doesn't require a new flag for gotestsum AFAICT.

@neiser
Copy link
Author

neiser commented Nov 19, 2022

@dnephin Guess you're busy with other things, but assuming I'd find time to look into this, is there a chance you'd merge such a feature?

@dnephin
Copy link
Member

dnephin commented Nov 20, 2022

I'd like to help you find a solution to this problem, but I'm not keen on maintaining a feature that requires knowledge of the coverage profile (even if it calls out to golang.org/x/tools/cover for a lot of it), because it doesn't seem like we understand how that format is used.

I looked at gocovmerge, and I noticed it says:

If there are source lines that overlap ... the process will exit with an error code.

I would expect a coverage profile generated with --rerun-fails to have overlaps, because it'll re-run the same test a few times. So maybe using gocovmerge as-is won't work? Again I don't know enough about the coverage profile format, or how tools read it, so I'm not really sure how we need to format that file.

Why isn't simply concatenating the lines together sufficient? Is the tool that reads the profile not able to combine duplicate lines? I found this old post https://lk4d4.darth.io/posts/multicover/ written by a former colleague of mine. It sounds like concatenating the profiles may be fine for the set mode, but I guess it doesn't work for count and atomic. I notice gocovmerge does handle those modes by actually summing up the counts. The default mode is set, so if you aren't using -covermode or -race, concat may be sufficient.

I'd much rather solve the problem with a script and --raw-command to show that it works. Once we have a working solution then integrating it into gotestsum should be easier because we can be reasonably confident we understand the format.

Maybe something like this would work to use gocovmerge along with --raw-command ?

test-with-coverage.sh

#!/usr/bin/env bash
set -eu

go test -json -coverprofile=tmp-coverage-${RANDOM}.out -coverpkg=. "$@"

That will generate the coverage profile to a file with a random name each time. That should let us use gocovmerge like this:

gotestsum --rerun-fails --raw-command ./test-with-coverage.sh
gocovmerge tmp-coverage-*.out > coverage.out
rm tmp-coverage-*.out
go tool cover -func coverage.out

If that works correct with --rerun-fails (with failed tests) then I think at least we'll have confirmed that the logic in gocovmerge is reasonably correct for this use case.

From there we could look at making it easier to integrate. Maybe gotestsum can set TEST_RUN_ID or something like that when running commands, so we could use ${TEST_RUN_ID} instead of ${RANDOM}.

I really want to avoid having gotestsum modify the arguments passed to go test (the value of the -coverprofile flag). So far gotestsum only ever adds to the argument list, or reports an error when arguments conflict with gotestsum arguments, but I don't think we're going as far as rewriting argument values. It would be nice to avoid that, and it seems like --raw-command does give us that option.

At the very least it would be great to document how to get this working together with gocovmerge. If you're able to get this suggestion working, I'd like to add it to the docs. If it doesn't work as-is then I'd be very happy to make some changes to get it working.

@dnephin
Copy link
Member

dnephin commented Nov 20, 2022

If it doesn't work that way it's possible that the logic for merging profiles will be involved than what gocovmerge supports. I suspect that tool is for merging separate package runs, not overlapping runs of the same package.

It's possible what we need is instead a way to only gather coverage from the first run, and to omit the coverprofile flag on all subsequent runs. If that's the case then passing TEST_RUN_ID to the --raw-command would also give us a way to solve the problem. We could omit the -coverprofile flag when TEST_RUN_ID != 0. Then maybe gocovmerge will do the rest.

@dnephin dnephin added the enhancement New feature or request label Nov 20, 2022
@neiser
Copy link
Author

neiser commented Nov 21, 2022

@dnephin Thanks for your very detailed reply. I fear you might have missed my quoted demo already above, which shows that the solution you've been proposing pretty much works as expected. Or am I mistaken and you had another point?

@dnephin
Copy link
Member

dnephin commented Nov 26, 2022

Ah, yes I did miss that, I had not realized you had it all working! That's awesome! So the problem left to solve is making it easier to use, is that right?

If gotestsum had some way of doing -coverprofile=$(mktemp coverage.out.rerun.XXXXXXXXXX) with a different value for each test run that would remove the need for --raw-command. Then all that would be required would be to run gocovmerge after gotestsum and remove the extra files. Does that sound reasonable? We can always expand support later.

I'm not sure what the best way to do that would be. Some options that come to mind:

  1. a gotestsum flag --rerun-fails-coverprofile=<filename prefix> which would add -coverprofile=<prefix>-<RUN-ID> to each test run
  2. support interpolation of RUN_ID into the command line args, and use -coverprofile=coverage.out.rerun-{RUN_ID}

If there were any other go test command line flags that could use variable interpolation then I'd probably want to do the second option. I can't find any others right now, so I'm liking the first option.

@ofek
Copy link

ofek commented Jun 14, 2023

Has there been any update on this?

@neiser
Copy link
Author

neiser commented Jun 14, 2023

@ofek I've somewhat lost interest in it as I've solved rerunning tests differently using Gitlab CI retries. Also a fully worked out workaround is available.

dixler pushed a commit to pulumi/pulumi that referenced this issue Dec 5, 2023
dixler pushed a commit to pulumi/pulumi that referenced this issue Dec 5, 2023
dixler pushed a commit to pulumi/pulumi that referenced this issue Dec 5, 2023
dixler pushed a commit to pulumi/pulumi that referenced this issue Dec 5, 2023
dixler pushed a commit to pulumi/pulumi that referenced this issue Dec 5, 2023
dixler pushed a commit to pulumi/pulumi that referenced this issue Dec 6, 2023
dixler pushed a commit to pulumi/pulumi that referenced this issue Dec 6, 2023
dixler pushed a commit to pulumi/pulumi that referenced this issue Dec 6, 2023
dixler pushed a commit to pulumi/pulumi that referenced this issue Dec 7, 2023
github-merge-queue bot pushed a commit to pulumi/pulumi that referenced this issue Jan 23, 2024
When `--rerun-fails` is specified and a test is re-run, the produced
coverage data only contains the coverage from the failed but re-executed
sub-tests. See gotestyourself/gotestsum#274

This change sees how we fare with this disabled as we've recently done
some work to address test flakiness. We'll be relying on the much less
granular `./scripts/retry` for retries, which doesn't have the coverage
problem. The downside is it will re-run all the tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants