Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple ts in Lin and Lin_api signatures #112

Open
wants to merge 60 commits into
base: main
Choose a base branch
from
Open

Conversation

jmid
Copy link
Collaborator

@jmid jmid commented Aug 16, 2022

This PR takes a stab at supporting multiple ts in the signature descriptions of Lin and Lin_api, thus fixing the Lin part of #62.

Overall, it supports multiple ts by using an underlying t array:

  • Initially the init result goes to index 0
  • New ts are always saved to fresh subsequent indices
  • The two spawned Domains can refer to ts from the sequential prefix and their own produced ts. This is achieved through an environment-based generation and thus should avoid race-conditions.
  • Note: the array is initialized to have all entries point to the same initial init result

The different ts are distinguished as variables, encapsulated with relevant functions in a Var module.
The environment handling is similarly encapsulated in an Env module.
To make it clear what variable t0 refers to, we include an initial let t0 = init () in the printed cmd triples.

The PR is probably best read in steps, mirroring its development:

  1. getting it to work for the raw Lin interface - this required changing the API
    • here gen_cmd is extended to take a variable generator as argument and
    • gen_cmd should return a generator of pairs, extended with a first component Var.t option to signal whether to save a t and if so, the variable to hold it. A side-effecting, gen-sym-like Var.next () is available to generate fresh variables
    • the Var.t option is also used to indicate which ts need cleanup (this should avoid double cleanup)
  2. getting it to work for the combinator-based Lin_api interface
    • here FnState is extended to carry a Var.t variable for identifying the desired state.
    • since we don't want to compare newly created ts for equality (only store them), they should use the [returning_] and [returning_or_exc_] combinators (note the final underscore _)
  3. updating all the lin_test.ml examples broken by the API change
    • with the gen_cmd signature change, this means we can no longer benefit from ppx_deriving_qcheck as it produces an unparameterized generator.
  4. in addition some lin_test_dsl.ml examples are extended to cover a larger subset of the tested API. Locally the extended Bytes test now triggers a segfault...

@jmid jmid force-pushed the multiple-ts branch 2 times, most recently from a6a025d to 542c417 Compare August 19, 2022 07:55
@jmid
Copy link
Collaborator Author

jmid commented Aug 19, 2022

Rebased on latest main.
Latest changes comment out the unsafe Bytes.escaped and include increasing the chance of producing counterexamples to sequential consistency.

The PR is still missing a way to shrink left-hand-sides of t-producing cmds: let t5 = Bytes.make 27 'z'.

@jmid
Copy link
Collaborator Author

jmid commented Sep 1, 2022

Just rebased on latest main

@jmid
Copy link
Collaborator Author

jmid commented Sep 2, 2022

5/8 CI runs fail to trigger the Bytes failure within the 1000 iterations, which makes the CI turn red.

@jmid
Copy link
Collaborator Author

jmid commented Sep 22, 2022

Rebased on latest main

@jmid
Copy link
Collaborator Author

jmid commented Sep 22, 2022

Rebased on latest main

lib/lin.ml Outdated Show resolved Hide resolved
@jmid
Copy link
Collaborator Author

jmid commented Oct 13, 2022

Rebased on latest main to try it out on 5.0.0~beta1 and with a bytecode CI target

@jmid
Copy link
Collaborator Author

jmid commented Oct 14, 2022

CI is all red, so I've spent some time trying to understand why:

  • On Linux it generally fails because the Lin_api Bytes issue is not triggered
  • On MacOSX it generally fails because the Lin_api Bigarray issue not triggered

I think we need to better understand why this PR no longer triggers these issues and address that before merging.

A few other things:

In one case on MacOSX there was a Lin_api Array test with Domain with crazy many reduction attempts (114580!):
https://github.com/jmid/multicoretests/actions/runs/3245274353/jobs/5322586745
Here we probably need to use a better t_var shrinking strategy (bisection?) to limit the number of attempted var_fixes?

Finally, I realized that returning_ described as [returning comb] indicates the return type [comb] which is ignored.
actually doesn't ignore a t, but saves it. This is clearly a brain-fart on my part that needs to be addressed somehow 🤷

@shym
Copy link
Collaborator

shym commented Oct 14, 2022

I wonder whether some recent modifications of the compiler might not be making some expected-to-fail tests require more runs to appear. To try and start testing that hypothesis, I just opened PR #157 (based on main).

@shym
Copy link
Collaborator

shym commented Oct 14, 2022

I also think “ignored” is confusing. I suppose you meant something like “will not be used to check consistency between linearized and non-linearized runs”, didn’t you? Part of the confusion is the fact that results of type “t” always fall into that category, which should be explicitly stated.

@jmid
Copy link
Collaborator Author

jmid commented Oct 14, 2022

I wonder whether some recent modifications of the compiler might not be making some expected-to-fail tests require more runs to appear. To try and start testing that hypothesis, I just opened PR #157 (based on main).

I think part of the reason may be

  • we are going from testing 6 to 14 type signatures in src/bytes/lin_tests_dsl.ml for example. This should decrease the chance of generating a random test that combines "the right ones".
  • of these 14 signatures 5 return a new t. This further increases the chance of triggering issues that arise from simultaneous mutation of the same t.

I've tried to counter these by playing with weights. This does not solve the issue entirely, so other things may play a role too, e.g., fetching and storing from a global array of ts may take more time than when we only has a single global t, thus affecting our timing of things on top of each other 🤔

P.s. I also spotted a duplicate Bytes.make in src/bytes/lin_tests_dsl.ml...

@jmid
Copy link
Collaborator Author

jmid commented Oct 14, 2022

I realized that reducing the init size in src/bytes/lin_tests_dsl.ml from 42 to 8 gave a nice output, but unexpectedly reduced our ability to trigger the Bytes issue. I have therefore restored it to 42...

@jmid
Copy link
Collaborator Author

jmid commented Oct 16, 2022

This is improving as it no longer all red! 😃

There are still a few outstanding issues:

Less serious:

  • two bytecode runs are failing to trigger int64 ref Thread (like main)

@shym
Copy link
Collaborator

shym commented Oct 19, 2022

Regarding Bigarray failures on macos, I opened #160 and #161, the combination of which I’m testing on my integration branch. They give reproducibility in my first rounds of testing.
#161 should also address, somehow, the less serious int64 ref Thread, sometimes trading failure to fail for timeouts, unfortunately.

@jmid
Copy link
Collaborator Author

jmid commented Oct 20, 2022

Rebased on latest main to pick up the combined effect of #160 (on multiple-ts) and #161 (on main)

@shym
Copy link
Collaborator

shym commented Oct 20, 2022

As main and multiple-ts contain naturally conflicting changes, I find it makes more sense (at least to me) to merge instead of rebasing. That’s why I created my integration branch, especially that commit. I’m not sure github UI can show how conflicts were resolved, like a git show --cc aae766b though.

@jmid
Copy link
Collaborator Author

jmid commented Oct 20, 2022

As main and multiple-ts contain naturally conflicting changes, I find it makes more sense (at least to me) to merge instead of rebasing.

Not sure I follow you here. There will be conflicts to manually resolve when either merging or rebasing, no?
In the above case, I messed up when manually resolving the rebasing conflicts - which I could just as well have done while resolving a merge conflict. 🤔

@shym shym mentioned this pull request Oct 21, 2022
@shym
Copy link
Collaborator

shym commented Oct 21, 2022

That’s probably a matter of preference, but by merging instead of rebasing, while the branch is still WIP, you record a trace of how conflicts were resolved. When the branch is ready I also tend to rebase it (if only to clean it up).

@jmid
Copy link
Collaborator Author

jmid commented Oct 28, 2022

In 5e460ad I've now tried to rephrase the intended semantics of the _returning combinators.

Overall this feature is starting to shape up nicely - thanks for helping out @shym!

For now, I want to let the CI complete a run.
I seem to recall having seen a CI log spending needlessly long time shrinking - which I suspect may be due to the added t_var shrink renaming. After the CI run completes, I therefore intend to go over the CI logs with a comb, to see whether this is an issue.

@jmid
Copy link
Collaborator Author

jmid commented Feb 2, 2023

I've now taken a first stab at merging main into this one.
It will require some more polishing and revision, but I wanted to throw it after the CI targets to test it out.
Locally, it seemed to not trigger the expected parallel Dynlink failure.

@jmid jmid mentioned this pull request Feb 2, 2023
@jmid
Copy link
Collaborator Author

jmid commented Feb 3, 2023

To summarize the latest CI run:

  • 3 threadomain failures - 1 Windows bytecode crash + 2 Windows timeouts (trunk + 5.0.0 bytecode)
  • 2 Windows Dynlink crashes (5.0.0 + trunk)
  • 1 Bigarray test failure not triggering an expected issue on macOS trunk
  • 1 Lin Weak Hashset test timing out due to excessive shrinking (17766.5 secs!) on Windows 5.0.0

@jmid jmid mentioned this pull request Feb 24, 2023
@jmid
Copy link
Collaborator Author

jmid commented Feb 27, 2023

a1dc261 offers a simplification of the Lin* interfaces by defining a cmd_triple record type.

@jmid
Copy link
Collaborator Author

jmid commented Mar 7, 2023

Resolved conflicts from merging main into this one

@jmid
Copy link
Collaborator Author

jmid commented Mar 10, 2023

The test suite is failing in a number of different ways with the current PR.
I tested on a MacMini, where increasing the frequency of Bigarray.Array.get makes the expected failure trigger more consistently. A number of other failures however still needs to be addressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants