How are StateT s Gen a and GenT (State s) a supposed to be used? #492

Open
ChickenProp opened this issue Jun 15, 2023 · 1 comment

ChickenProp commented Jun 15, 2023

I have a test pattern that looks like this: when I generate objects, I put them in state; then, when I generate other objects that reference them, I can look them up. So if a File references a Folder, and I want to create a File, I have two options:

  • Generate a Folder explicitly and pass it to the File generator
  • Look up the list of Folders in state and either pick one of them or generate a new one using some default generator

This makes it easy to say "give me three different Files, which may or may not be in the same Folder".
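
For concreteness, the pattern could be sketched roughly as follows. The `Folder` and `File` types and every generator name here are hypothetical, invented only to illustrate the two options above; the sketch assumes the same qualified imports as the snippets later in this issue.

```haskell
import Control.Monad.State (StateT)
import qualified Control.Monad.State as State
import Hedgehog (Gen)
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range

-- Hypothetical domain types, for illustration only.
newtype Folder = Folder Int
  deriving (Eq, Show)

data File = File
  { fileId :: Int
  , fileFolder :: Folder
  } deriving (Eq, Show)

-- Generate a fresh Folder and remember it in state.
genFolder :: StateT [Folder] Gen Folder
genFolder = do
  folder <- Folder <$> Gen.int (Range.constant 0 100)
  State.modify (folder :)
  pure folder

-- Option 1: the caller supplies the Folder explicitly.
genFileIn :: Folder -> StateT [Folder] Gen File
genFileIn folder = do
  ident <- Gen.int (Range.constant 0 100)
  pure (File ident folder)

-- Option 2: pick a Folder already in state, or fall back to a new one.
genFolderFromState :: StateT [Folder] Gen Folder
genFolderFromState = do
  known <- State.get
  case known of
    [] -> genFolder
    _ -> Gen.choice [Gen.element known, genFolder]

genFile :: StateT [Folder] Gen File
genFile = genFolderFromState >>= genFileIn
```

With this, `replicateM 3 genFile` would express "three Files that may or may not share a Folder".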

But, it seems that shrinking doesn't work like I'd hoped. Here's a simple demonstration:

numGen1 :: StateT [Int] Gen (Int, [Int])
numGen1 = do
  a <- do
    num <- Gen.int (Range.constant 0 10)
    State.modify (num :)
    pure num
  b <- do
    num <- Gen.int (Range.constant 0 10)
    State.modify (num :)
    pure num

  numsState <- State.get
  (, [b, a]) <$> Gen.element numsState

numGen2 :: StateT [Int] Gen (Int, [Int])
numGen2 = do
  numsGen <- Gen.list (Range.constant 2 2) $ do
    num <- Gen.int (Range.constant 0 10)
    State.modify (num :)
    pure num

  numsState <- State.get
  (, numsGen) <$> Gen.element numsState

I'd hope these would be basically the same: generate two numbers, save them in state, then pick one of the numbers that was saved. We have runStateT numGen_ [] :: Gen ((Int, [Int]), [Int]), and for every value generated, the two lists should be equal and the single value should be contained in them.
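
Written as a predicate, that expectation might look like the sketch below. The name `stateMatchesOutput` is hypothetical, and it takes the statement above literally (list equality, not equality up to ordering).

```haskell
-- Hypothetical helper: checks one result of `runStateT numGen_ []`.
-- The returned list must equal the final state, and the picked value
-- must be one of the saved numbers.
stateMatchesOutput :: ((Int, [Int]), [Int]) -> Bool
stateMatchesOutput ((picked, generated), saved) =
  generated == saved && picked `elem` saved
```

Applied to the trees in this issue, it accepts `((7,[6,7]),[6,7])` and rejects `((0,[10,0]),[0])`.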

This property holds for the immediately generated values, and for shrunk values from numGen1:

ghci> Gen.printTree $ runStateT numGen1 []
((7,[6,7]),[6,7])
 ├╼((0,[6,0]),[6,0])
 │  ├╼((0,[0,0]),[0,0])
 │  │  └╼((0,[0,0]),[0,0])
 │  ├╼((0,[3,0]),[3,0])
 │  │  ├╼((0,[2,0]),[2,0])
...

But it fails for shrunk values from numGen2:

ghci> Gen.printTree $ runStateT numGen2 []
((10,[10,1]),[10,1])
 ├╼((0,[10,0]),[0])
 │  ├╼((0,[0,0]),[0,1])
 │  ├╼((5,[5,0]),[5,1])
 │  │  ├╼((3,[3,0]),[3,1])
 │  │  │  └╼((2,[2,0]),[2,1])
 │  │  │     └╼((1,[1,0]),[1,1])
...

where the two lists are different, though somehow the single value is still contained in both of them.

I think the culprit is that Gen.list does something complicated:

list :: MonadGen m => Range Int -> m a -> m [a]
list range gen =
  let
     interleave =
       (interleaveTreeT . nodeValue =<<)
  in
    sized $ \size ->
      ensure (atLeast $ Range.lowerBound size range) .
      withGenT (mapGenT (TreeT . interleave . runTreeT)) $ do
        n <- integral_ range
        replicateM n (toTreeMaybeT gen)

interleaveTreeT :: Monad m => [TreeT m a] -> m (NodeT m [a])
interleaveTreeT =
  fmap Tree.interleave . traverse runTreeT

It's not clear to me why it does this, but the term "interleave" makes me think it's about rearranging the shrink tree, where the default behavior would only shrink one element at a time?

(If we replace Gen.list (Range.constant 2 2) with replicateM 2, then numGen2 behaves like numGen1.)
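
For reference, the replicateM variant might be written as in the sketch below. The name `numGen3` is hypothetical, and the `reverse` is an addition not in the original replacement: replicateM returns the numbers in generation order while the state holds the most recent first, so reversing makes the two lists directly comparable.

```haskell
{-# LANGUAGE TupleSections #-}
import Control.Monad (replicateM)
import Control.Monad.State (StateT)
import qualified Control.Monad.State as State
import Hedgehog (Gen)
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range

-- Hypothetical name: numGen2 with Gen.list swapped for replicateM.
numGen3 :: StateT [Int] Gen (Int, [Int])
numGen3 = do
  -- replicateM sequences the action twice in the StateT monad itself,
  -- without going through Gen.list's interleaving.
  numsGen <- replicateM 2 $ do
    num <- Gen.int (Range.constant 0 10)
    State.modify (num :)
    pure num

  numsState <- State.get
  -- State is newest-first, so reverse the generation-order list to match.
  (, reverse numsGen) <$> Gen.element numsState
```

Running `Gen.printTree $ State.runStateT numGen3 []` should let the shrink tree be compared with the one from numGen1.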

So I guess I'm asking if this kind of thing is expected behavior for StateT s Gen a; and is there a way to do the kind of thing I'm trying to do without avoiding Gen.list entirely?

I've wondered about using GenT (State s) a instead, but I don't know how that would work. There's hoist to turn it into a Gen a, and hoist (Identity . flip evalState []) typechecks. Does it work? It's not obviously wrong: I can simply change the type of numGen2 to GenT (State [Int]) (Int, [Int]) and do

ghci> Gen.printTree $ hoist (Identity . flip evalState []) numGen2'
(4,[9,4])
 ├╼(0,[0,4])
 │  ├╼(0,[0,0])
 │  ├╼(2,[0,2])
 │  │  └╼(1,[0,1])
 │  └╼(3,[0,3])
...

...but I've lost access to the state variables here, so I can't tell what's going on with that, and passing in [] feels like I might be losing state somewhere? But I don't know. Similarly I could use forAllT, but then I'd need to turn a PropertyT (State s) into a PropertyT IO, which feels like it would have the same problem with [].

So, will that do what I want? I'm not really sure how to investigate further other than "try it and hope I don't run into errors that I don't understand".

@ChickenProp

Ah, no, it doesn't work. I can return numsState directly and I get

numGen2' :: GenT (State [Int]) (Int, [Int], [Int])
numGen2' = do
  numsGen <- Gen.list (Range.constant 2 2) $ do
    num <- Gen.int (Range.constant 0 10)
    State.modify (num :)
    pure num

  numsState <- State.get
  (, reverse numsGen, numsState) <$> Gen.element numsState

ghci> Gen.printTree $ hoist (Identity . flip State.evalState []) numGen2'
(9,[7,9],[7,9])
 ├╼(0,[7,0],[0])
 │  ├╼(0,[0,0],[0])
 │  ├╼(4,[4,0],[4])
 │  │  ├╼(2,[2,0],[2])
 │  │  │  └╼(1,[1,0],[1])
 │  │  └╼(3,[3,0],[3])
...

which looks interestingly different from the results from numGen2, but still not what I'm hoping for.
