Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add information to logs with a custom runner #205

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

shym
Copy link
Collaborator

@shym shym commented Nov 21, 2022

Add a bash script to use as runner for our test suite

This script adds information to the logs:

  • explicit names of the test run when they start
  • some explicit messages for crashes (SIGSEGV, SIGBUS, ...)
  • and, maybe most importantly, anchors in CI logs, so that the main webpage contains the most important information and direct links to precise points of interest in the log

It is written as a bash script but making sure it can be used as a runner also on Windows CI (notably by commenting all ends of lines, without which it fails with errors about '\r's that don't exist...)

Example of result of the changes: the webpage for this run shows directly that we got a SEGV in Ephemeron on trunk (windows). Or this run with SIGABRT in buffers on trunk on macos.

Add a bash script to use as runner for our test suite

This script adds information to the logs:
- explicit names of the test run when they start
- some explicit messages for crashes (SIGSEGV, SIGBUS, ...)
- and, maybe most importantly, anchors in CI logs, so that the main
  webpage contains the most important information and direct links to
  precise points of interest in the log

It is written as a bash script but making sure it can be used as a
runner also on Windows CI (notably by commenting all ends of lines,
without which it fails with errors about '\r's that don't exist...)
@jmid
Copy link
Collaborator

jmid commented Nov 25, 2022

I'm a bit hesitant about this one.
As someone who has scanned CI logs for the past two weeks I certainly follow you on the benefits!

On the negative side,

  • this adds back in a custom runner now that we finally got rid of the clunky check_error_count
  • it is in bash - which is untyped, has lots of quirks (witness the newline issue), and has caused problems for people before (I think there was even an incident at JaneStreet)

I'm wondering if this is something

  • that could/should be handled in dune or in QCheck_base_runner instead
  • or something we could handle with a custom runner?

I've been thinking of the latter to avoid printing counterexamples twice (with and without return values)...

@shym
Copy link
Collaborator Author

shym commented Nov 28, 2022

I do agree it’d be nicer to go without bash. I had tried first to see if it would be possible to do in dune, but I couldn’t figure it out (see the fact that the error code shown in windows for a segfault is completely different from what it reports in linux).
The custom runner would be the nicest solution, especially if we also gain on the actual output.

@shym
Copy link
Collaborator Author

shym commented Nov 29, 2022

Now, to plead the case for that PR:

  • we are the end users for the test suite and can make choices that would otherwise be clearly dangerously fragile, so I thought that going with bash would not be a problem if it works in the environments where we run the suite,
  • as we see in our logs, segfaults are logged differently between Windows and Unix (negative exit value instead of actual signal) and, according to a few small tests, this problem actually comes with using Unix.system or Sys.command; so writing a custom (OCaml) runner will entail merging both behaviours, what bash is already doing for us.

@shym
Copy link
Collaborator Author

shym commented Dec 1, 2022

For the record, I thought using the echo dune stanza could at least provide the feature of having the path of the test being run before it starts but my attempt failed: https://github.com/shym/multicoretests/actions/runs/3591271602/jobs/6045600378#step:10:41

@jmid
Copy link
Collaborator

jmid commented Jan 31, 2023

Overall, I can see benefit to the CI log anchors and would like to salvage them!

I'm wondering whether a good solution would be to extend QCheck to output the right format when running under a suitable environment variable, e.g., QCHECK_GA_CI as that could benefit more QCheck CI users and avoid introducing a specialized runner script. 🤔

What do you think?

How different an output should be printed to achieve that, e.g., for the below? (I see some "::-header" printing in the PR code)

random seed: 337549857
generated error fail pass / total     time test name
[✓] 1000    0    0 1000 / 1000     1.2s STM Bytes test sequential
[✓]   11    0    1   10 / 1000    57.4s STM Bytes test parallel

--- Info -----------------------------------------------------------------------

Negative test STM Bytes test parallel failed as expected (21 shrink steps):

                             |           
                             |           
                  .---------------------.
                  |                     |           
                To_seq          (Fill (7, 5, 'K'))  


+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Messages for test STM Bytes test parallel:

  Results incompatible with linearized model

                                                                                               |                                            
                                                                                               |                                            
                                                 .------------------------------------------------------------------------------------------.
                                                 |                                                                                          |                                            
     To_seq : ['a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'K'; 'K'; 'K'; 'K'; 'a'; 'a'; 'a'; 'a']                                (Fill (7, 5, 'K')) : Ok (())                               

================================================================================
success (ran 2 tests)

@shym
Copy link
Collaborator Author

shym commented Jan 31, 2023

The documentation explains here the syntax to report an error and create the anchor; tldr: echo "::error title=Failure in STM Bytes test parallel::STM Bytes test parallel failed on its 11th run after 57.4 seconds with seed 337549857".
That would nice indeed.
The bash runner provides one thing particularly useful for multicoretests that would be harder to bring into QCheck: it can report when a test crashes (on a signal). That would require wrapping all our tests in some sort of fork_prop (and ensuring we manage to report it as the proper signal even on Windows, something that dune fails at, by the way).

@jmid
Copy link
Collaborator

jmid commented Jan 31, 2023

Ah, this old brain is finally starting to understand 😅
That's why you need a "parent process" to catch crashes and print the anchor (rather than just crash and burn).

For the weird Windows signalling, I think we should report it along with test case if possible to the dune developers.

@jmid
Copy link
Collaborator

jmid commented Jan 31, 2023

BTW, thanks for the documentation link - much appreciated! 🙏
Potentially QCheck could ::group ... ::endgroup a list of tests in a test suite in such a mode.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants