Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recheck is slow #419

Open
AlexeyRaga opened this issue Jul 29, 2022 · 22 comments
Open

Recheck is slow #419

AlexeyRaga opened this issue Jul 29, 2022 · 22 comments

Comments

@AlexeyRaga
Copy link

Not sure if it is known or by design, but running a test with recheck is more than 10x slower than having the same test run.

It feels strange knowing that recheck only runs 1 tests while check does a bunch or runs and a bunch of shrinks and it is still faster.

I am takling about figures like:

check: 22 tests, 51 shrink, failed in 0.9+ seconds
recheck: 1 test, failed in 9+ seconds

@TysonMN
Copy link
Member

TysonMN commented Jul 29, 2022

Wow! Recheck is supposed to be faster. In particular, recheck is supposed to only run your test once, so any breakpoint in your code is only hit once.

I need more information to understand the problem.

Can you reproduce the behavior with code you can share?

@AlexeyRaga
Copy link
Author

It does seem to run once and a breakpoints only hit once, I confirm.
But somehow it is much slower to run.

there is this hedgehogqa/fsharp-hedgehog-xunit#7 but it looks like a separate issue.

I will see if I can find time to build an artificial case to reproduce.

@dharmaturtle
Copy link
Member

In particular, recheck is supposed to only run your test once, so any breakpoint in your code is only hit once.

Potentially relevant: #420

@TysonMN
Copy link
Member

TysonMN commented Jul 29, 2022

Thanks for the reminder @dharmaturtle.

@AlexeyRaga, does the last line of your CE include return? That is necessary in order to only retest the failed input.

Even so, I don't think this fully explains your experience.

@dharmaturtle
Copy link
Member

@AlexeyRaga are you using C# or F#? Additionally, are you using Hedgehog Experimental and/or Hedgehog Xunit?

@AlexeyRaga
Copy link
Author

@dharmaturtle I use both. I don't believe that it does (or could) matter though. I do not nave anything in the CE.
I add latest versions of all 3: Hedgehog, Hedgehog.Experimental and Hedgehog.Xunit.

Here is an example that reproduces the issue for me, and I found something weird. So I put all the info into this comment, as it is still related.
Let's stick with F# only.

A function under test:

module FSModule

let mutable count = 0
let add x y =
    count <- count + 1
    printfn $"Count: {count}"
    let res = x + y
    if res > 50 then x else res

And here is the test:

let ``Test with Property attribute`` (x : int32, y : int32) =
    let res = FSModule.add x y
    res = x + y

When I run it with [<Property>] attribute, it predictably fails and I get the recheck from that failure.
For me it now is: "19_11670738321710543822_15170413700225871527_1010101010".

When I add Recheck("19_11670738321710543822_15170413700225871527_1010101010") attribute and re-run the test, I see this in my console:

Count: 1
...
Count: 60

So it confirms that the test runs multiple times.

Now here is where it gets strange.

I add this "Plain" Hedgehog test:

[<Xunit.Fact>]
let ``Test with plain Hedgehog`` () =
    let prop =
        property {
            let! x = Gen.int32 (Range.linearBounded())
            let! y = Gen.int32 (Range.linearBounded())
            return x + y = FSModule.add x y
        }
    prop |> Property.recheckBool "19_11670738321710543822_15170413700225871527_1010101010"

Notice the same recheck data.

And this test passes!. This is totally unexpected to me. But maybe it is a different issue (if it is an issue). Or maybe it is not an issue? Can it be because different bounds were used for int32 in these two cases or something like it?

Anyway, now run this test as prop |> Property.checkBool to grab the failing recheck data.
I also notice that the test prints 48 iterations before it fails.

I get the recheck and run it with
Property.recheckBool "0_7333683850760263877_13670925995324076957_01010101010101010101010101010101010110110"

Now the test fails printing 338 iterations!
So maybe it is an upstream issue! But I am pretty sure that I saw it doing just one iteration before. I am totally confused now.

@AlexeyRaga
Copy link
Author

I have created a repo here if it can be helpful:

https://github.com/AlexeyRaga/fsharp-csharp-tests/blob/main/FSharpTest/Tests.fs

@TysonMN
Copy link
Member

TysonMN commented Jul 31, 2022

When I add Recheck("19_11670738321710543822_15170413700225871527_1010101010") attribute and re-run the test, I see this in my console:

Count: 1
...
Count: 60

That is hedgehogqa/fsharp-hedgehog-xunit#7. @dharmaturtle has a fix for it. There will be a new release of the NuGet package soon.

And this test passes!. This is totally unexpected to me. But maybe it is a different issue (if it is an issue). Or maybe it is not an issue? Can it be because different bounds were used for int32 in these two cases or something like it?

Yes, the generators have to be exactly the same, including the bounds. My guess is that this explains this issue.

I get the recheck and run it with
Property.recheckBool "0_7333683850760263877_13670925995324076957_01010101010101010101010101010101010110110"

Now the test fails printing 338 iterations!
So maybe it is an upstream issue! But I am pretty sure that I saw it doing just one iteration before. I am totally confused now.

Property.recheckBool didn't have efficient rechecking. See #420. This is a limitation imposed by F#.

However, we could minimize the impact of the F# limitation by not exposing functions in our API that encourage users to "run into" this F# limitation.

I am inclined to remove all the Property.*bool* functions.

@dharmaturtle
Copy link
Member

dharmaturtle commented Jul 31, 2022

However, we could minimize the impact of the F# limitation by not exposing functions in our API that encourage users to "run into" this F# limitation.

I'm in favor. If done, we'll need to be careful though. Tests that return false may "pass" by virtue of not being checked.

@JohnEffo
Copy link
Contributor

JohnEffo commented Jul 3, 2023

This is an issue for me, especially if I combine, choice with list, the test runs fast and the shrink is slow, when I ran into this trying to write demo code the property took so long to recheck I initially thought I'd crashed the IDE, this is not as slow but still noticable.

using System.Collections.ObjectModel;
using Gen = Hedgehog.Linq.Gen;
using Range = Hedgehog.Linq.Range;
using Linq;
public class ChoiceRecheck
{
    [Fact]
    public void Choice()
    {
        var low = Gen.Int32(Range.Constant(0, 5));
        var mid = Gen.Int32(Range.Constant(10, 50));
        var big = Gen.Int32(Range.Constant(100, 200));
        var large = Gen.Int32(Range.Constant(500, 1000));
        var chioce = Gen.Choice(new Collection<Gen<int>> { low, mid, big, large }).List(Range.Constant(100,200));

        var prop = Property.ForAll(chioce).Select(x => x.Any(Test ));
        //prop.Check();
        prop.Recheck("0_8474167946941752373_7006309460388453401_000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000");

    }

    public bool Test(
        int x) => x == 990;
}

@TysonMN
Copy link
Member

TysonMN commented Jul 4, 2023

@JohnEffo, can you open a new issue for that?

That is hedgehogqa/fsharp-hedgehog-xunit#7. @dharmaturtle has a fix for it. There will be a new release of the NuGet package soon.

@dharmaturtle, I was that NuGet package released? If so, what version was it?

@dharmaturtle
Copy link
Member

0.5.0

@TysonMN
Copy link
Member

TysonMN commented Jul 4, 2023

@AlexeyRaga, does that version resolve your problem?

@AlexeyRaga
Copy link
Author

@TysonMN I am honestly lost by now what workarounds should be applied and in which combination... :(

Should I use or not use return? Am I allowed or not allowed to use checkBool and recheckBool? In which combination with return does it or does not it work? In which case it half-works and in which case it doesn't?

No, this code below still doesn't work as intended and recheckBool still runs tons of iterations and arrives at an entirely different result at the end.

I am using the latest versions of all the Hedgehog and XUnit packages.

let cond() =
    let mutable n = 0
    fun (str : string) ->
        n <- n + 1
        Console.WriteLine $"{n}: {str}"
        str.Contains 'r'

[<Fact>]
let ``My test`` () =
    let cnd = cond()
    property {
        let! str = Gen.string (Range.constant 1 5) Gen.alpha
        return (cnd str)
    }
    |> Property.checkBool
    // |> Property.recheckBool "0_4278704406483328434_14594425247001969359_0000"

I naively hope that there can be the world in which an invalid program would be rejected by the compiler, and an accepted one would work as intended...

I am now very hesitant to use Hedgehog in my team because of this very tricky and unobvious thing that is really hard to communicate across the team members :(

@TysonMN
Copy link
Member

TysonMN commented Jul 15, 2023

I have a longer comment planned in my head, but I can't write it all now. Here is a brief response.

Quoting the OP:

I am takling about figures like:

check: 22 tests, 51 shrink, failed in 0.9+ seconds recheck: 1 test, failed in 9+ seconds

In hindsight, the problem you describe here is identical to what @JohnEffo describes in this comment. I asked @JohnEffo to create a new issue (#432) because I incorrectly thought it was a different issue. Then I fixed that issue in PR #433.

@JohnEffo
Copy link
Contributor

@TysonMN Thanks for looking into this, I have not pulled the latest version and checked it out, I shall do so ASAP and get back to you.

@TysonMN
Copy link
Member

TysonMN commented Jul 15, 2023

Now here is where it gets strange.

I add this "Plain" Hedgehog test:

[<Xunit.Fact>]
let ``Test with plain Hedgehog`` () =
    let prop =
        property {
            let! x = Gen.int32 (Range.linearBounded())
            let! y = Gen.int32 (Range.linearBounded())
            return x + y = FSModule.add x y
        }
    prop |> Property.recheckBool "19_11670738321710543822_15170413700225871527_1010101010"

Notice the same recheck data.

And this test passes!. This is totally unexpected to me. But maybe it is a different issue (if it is an issue). Or maybe it is not an issue? Can it be because different bounds were used for int32 in these two cases or something like it?

Yes, the generators have to be exactly the same, including the bounds. My guess is that this explains this issue.

Now I think I know exactly the problem. You combined the generators monadically (by using let! twice). I expect that @dharmaturtle implemented the behavior in Hedgehog.Xunit applicatively (which in your code would have been to use let! and then and!).

@dharmaturtle, can you confirm that Hedgehog.Xunit implements this behavior applicatively?

@AlexeyRaga, can you try that code after replacing the second let! with and! to see if that matches your expectations (i.e. the test should fail on the same shrunken input)?

@TysonMN
Copy link
Member

TysonMN commented Jul 15, 2023

Property.recheckBool didn't have efficient rechecking. See #420. This is a limitation imposed by F#.

However, we could minimize the impact of the F# limitation by not exposing functions in our API that encourage users to "run into" this F# limitation.

I am inclined to remove all the Property.*bool* functions.

I should be careful when claiming that something is impossible. I was wrong. I still think there is a related limitation imposed by F#, but it doesn't prevent Property.recheckBool from benefitting from efficient rechecking (i.e. only the shrunken input is tested).

PR #437 implements efficient rechecking for Property.recheckBool.

I no longer think there is an efficiency problem with all the Property.*bool* functions, so they can stay.

@dharmaturtle
Copy link
Member

@dharmaturtle, can you confirm that Hedgehog.Xunit implements this behavior applicatively?

Yep, done here. (Property.forAll => BindReturn)

@dharmaturtle
Copy link
Member

A short explanation of monadic vs appliciative behavior

You combined the generators monadically

Alright let's translate some of TysonMN's FP nerdspeak (a monad is just a monoid in the category of endofunctors what's the problem) to mortalspeak. You can combine generators in two ways: monadically, or applicatively. Let's say you have two gens:

        property {
            let! x = Gen.int32 (Range.linearBounded())
            let! y = Gen.int32 (Range.linearBounded())
            return x + y = FSModule.add x y
        }

The example above is monadic; it uses let! and let!. Which means that x must be calculated before y. In contrast...

        property {
            let! x = Gen.int32 (Range.linearBounded())
            and! y = Gen.int32 (Range.linearBounded())
            return x + y = FSModule.add x y
        }

The example above is applicative; it uses let! and and!. Which means that x and y are calculated simultaneously.

Why does this matter? Applicatives allow for more efficient rechecking (compared to combining monadiclly). Here's an intuitive explanation:

  • For a monadic combination, x is first calculated, then y to arrive at the final result. (Serial.)
    • If there's an error, if x is shrunk, y must also be shrunk. (y, in theory, may be shrunk independent of x, e.g. holding x constant.)
  • For an applicative combination, x is calculated "concurrently" with y to arrive at the final result. (Parallel.)
    • If there's an error, x can be shrunk independently of y.

I'm intentionally using language that hints at multithreading to trigger your intuition.

Applicative behavior isn't "better" than monadic behavior. Monadic behavior allows you to declare dependencies between generators, e.g.

        property {
            let! x = Gen.int32 (Range.linearBounded())
            let! y = Gen.constant x
            return x + y = FSModule.add x y
        }

It is not possible to write the above in an applicative manner - if x is shrunk, so must y.

In summary, where possible, choose applicative behavior because it allows for more efficient rechecking. However this is not always possible so you may be forced to choose monadic behavior. It's a tradeoff between power (you can express more with monadic) and rechecking efficiency.

@TysonMN please feel free to tear this apart :)

  • This is an explanation of and! using Result
  • This is a short explanation of and! using Option

@TysonMN
Copy link
Member

TysonMN commented Jul 16, 2023

@TysonMN please feel free to tear this apart :)

Very good. Minor quibbles. Here is one correction.

Applicatives allow for more efficient rechecking shrinking (compared to combining monadiclly).

@TysonMN
Copy link
Member

TysonMN commented Jul 16, 2023

@TysonMN I am honestly lost by now what workarounds should be applied and in which combination... :(

Sorry about that. Thanks for sticking it out though. You are helping to make Hedgehog better! I appreciate it :)

Should I use or not use return?

You should. If you don't, then when rechecking, your test code could be called more than once (instead of only once on the shrunken value). Requiring that every Property CE includes a return statement is still an improvement I have yet to make (or figure out if F# allows me to implement efficient rechecking without a return statement in the Property CE).

Am I allowed or not allowed to use checkBool and recheckBool?

In the current version (which is 0.13.0), recheckBool does not have efficient rechecking. Now that PR #437 has merged, recheckBool does have efficient rechecking. I plan to create a new release soon with this improvement.

In which combination with return does it or does not it work?

Those two issues are independent of each other.

In which case it half-works and in which case it doesn't?

I am unaware of any case in which something half works.

No, this code below still doesn't work as intended and recheckBool still runs tons of iterations and arrives at an entirely different result at the end.

I am using the latest versions of all the Hedgehog and XUnit packages.

Does the code work as intended when using the current version of master (now that PR #437 is merged)?

I naively hope that there can be the world in which an invalid program would be rejected by the compiler, and an accepted one would work as intended...

"Invalid" is a bit strong here. I believe that we are contrasting code that does or does not have efficient rechecking. F# Hedgehog was initially released in 2017. In July of 2021, a user essentially requested (in #332) the feature that I now call efficient rechecking. With great effort, I implemented this feature and merged it in December of 2021 (via PR #336). Like all software, its impact was reduced because of limitations imposed by existing code and by bugs. Since then, this situation has continued to improve. See the change log for a list of these improvements. With your help, this list of improvements is now longer. Thanks for helping to make that possible.

I am now very hesitant to use Hedgehog in my team because of this very tricky and unobvious thing that is really hard to communicate across the team members :(

I don't want Hedgehog to be (very) tricky or unobvious. Please create an issue for each thing that you think is tricky or unobvious.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants