[Testing] Introduce proptest property testing library and add type aliases for hashbrown types #4477

Open · wants to merge 28 commits into base: next
Conversation

@cylewitruk (Member) commented Mar 4, 2024

This PR introduces the proptest crate with some example property tests and wrapper types in stacks-common for hashbrown's HashMap and HashSet types.

Note: This is a break-out from the larger draft PR #4437.

I have provided a number of proptest strategies for common Clarity types, including Contract, TypeSignature, SymbolicExpression and Value (some courtesy of @Acaccia from clarity-wasm).

The hashbrown wrapper types are needed because we cannot implement external traits on external types, and I have provided proptest strategies for these new types as well. To use the new types, simply replace std::collections::HashMap/Set or hashbrown::HashMap/Set with clarity_common::types::StacksHashMap/Set as HashMap/Set.
This caused problems in the signer lib, which uses a combination of hashbrown and std, so given that it isn't strictly needed for this PR I have removed these new concrete types in favor of type aliases.
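
For illustration, a minimal sketch of what the alias approach looks like (the exact definitions and module path in the PR may differ; unlike the earlier newtype wrappers, plain aliases do not permit implementing foreign traits):

```rust
// Type aliases over hashbrown's collections: call sites import a single
// name, and the underlying implementation/hasher can be swapped in one place.
pub type StacksHashMap<K, V> = hashbrown::HashMap<K, V>;
pub type StacksHashSet<T> = hashbrown::HashSet<T>;
```

Call sites then switch their imports, e.g. `use clarity_common::types::StacksHashMap as HashMap;`, as described above.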

Note that I have gone through and replaced use hashbrown::xxx with the above, hence the large number of touched files.

I have also added additional tests for ClarityDatabase which 1) were missing, and 2) demonstrate the usage of proptest.
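
As a rough illustration of the pattern those tests follow, here is a minimal, self-contained proptest sketch; the strategy and the write/read pair are placeholders, not the actual ClarityDatabase code from this PR:

```rust
use proptest::prelude::*;

proptest! {
    // Round-trip property in the spirit of the new ClarityDatabase tests:
    // whatever is generated, written out, and read back must compare equal.
    // `any::<Vec<u8>>()` stands in for a project-specific strategy (e.g. one
    // producing Clarity `Value`s), and the reverse/reverse pair stands in for
    // the real serialize/insert/read/deserialize cycle.
    #[test]
    fn roundtrip_is_identity(input in any::<Vec<u8>>()) {
        let written: Vec<u8> = input.iter().rev().copied().collect();
        let read_back: Vec<u8> = written.iter().rev().copied().collect();
        prop_assert_eq!(input, read_back);
    }
}
```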

Resolves Issues

@cylewitruk cylewitruk added the optimization and testing labels and removed the optimization label Mar 4, 2024
@jcnelson jcnelson self-requested a review March 4, 2024 15:47
@jcnelson (Member) left a comment


I'm not opposed to adding a way to create mocked data structures, but I think this is fundamentally the wrong way to do it.

First, this should not be a dependency at all. It's purely for testing; it should be only a dev-dependency.

Second, it doesn't look like fake is actually doing that much work for us. You're still creating all these constructors for all these data structures, and populating them with Rng-synthesized data. Why have fake at all then?

Third, every data type in Clarity has a bound size by design. But, none of these constructors exercise it. For example, it would be very useful to be able to do things like construct data types of maximum size, or maximum depth, in order to e.g. benchmark the code or iron out resource usage overflows, but that doesn't appear to be something fake does for us.

I'm not convinced that this PR is necessary.

@cylewitruk (Member, Author) commented Mar 4, 2024

First, this should not be a dependency at all. It's purely for testing; it should be only a dev-dependency.

As I wrote in the description, this needs to be included as an optional dependency because of cargo's limitation that it doesn't propagate the test profile to dependencies, even in the same workspace -- #[cfg(test)] code in stacks-common will not be visible to clarity or stackslib etc. when they are built in the test profile -- which is why it has been conditionalized on the testing feature. This is the recommended workaround according to here and here, and since the testing feature already exists in most crates, it seems this has been a problem before.
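
For readers unfamiliar with the workaround, this is roughly what the feature gating looks like (a sketch; the module name matches this PR's proptesting module, but the exact gating in the code may differ):

```rust
// In the library crate (e.g. stacks-common or clarity): test-only helpers
// are gated on a `testing` cargo feature rather than on `cfg(test)` alone,
// because `cfg(test)` is not enabled when a *dependent* crate is built for
// its own tests. Downstream crates enable the feature from their
// dev-dependencies instead.
#[cfg(any(test, feature = "testing"))]
pub mod proptesting {
    // proptest strategies for this crate's types live here
}
```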

Second, it doesn't look like fake is actually doing that much work for us. You're still creating all these constructors for all these data structures, and populating them with Rng-synthesized data. Why have fake at all then?

I've provided specific implementations to manage recursion and to handle types with requirements such as two separate Vecs that need to have the same len(). Without this, the recursion is unpredictable and can sometimes cause the test thread to go OOM. Fake provides a lot of boilerplate that frankly would be a waste of time to re-implement for this project "just because" -- I encourage you to check out their examples.

Third, every data type in Clarity has a bound size by design. But, none of these constructors exercise it. For example, it would be very useful to be able to do things like construct data types of maximum size, or maximum depth, in order to e.g. benchmark the code or iron out resource usage overflows, but that doesn't appear to be something fake does for us.

With a bit more effort this could easily be implemented, very similar to their built-in "locales" support. It isn't a use case at the moment, though; the basic tests only check that instances can be serialized, inserted into the database, read back, deserialized, and compare equal.

I'd just like to add that this is intended to be a "first introduction" of the crate -- while these happen to be rather complex, recursive types, I don't think there are many other types in the codebase with the same recursive nature, so for the majority of remaining types the simple #[cfg_attr(feature = "testing", derive(::fake::Dummy))] should be enough. If specific constraints apply to fields, they can be attributed individually.
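
For context, a minimal sketch of what the derive-based usage looks like with fake; the example type is invented for illustration, and it assumes fake's derive feature is enabled:

```rust
use fake::{Dummy, Fake, Faker};

// Invented example type: the derive generates a `Dummy` implementation that
// fills every field with synthesized data.
#[derive(Debug, Dummy)]
struct ExampleHeader {
    name: String,
    // Per-field constraint, as mentioned above: restrict the generated range.
    #[dummy(faker = "1..10")]
    version: u8,
}

fn main() {
    // Faker synthesizes a fully populated instance with random field values.
    let header: ExampleHeader = Faker.fake();
    println!("{header:?}");
}
```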


codecov bot commented Mar 4, 2024

Codecov Report

Attention: Patch coverage is 81.11111%, with 153 lines in your changes missing coverage. Please review.

Project coverage is 83.20%. Comparing base (cab1a11) to head (a53aeed).

Additional details and impacted files
@@            Coverage Diff             @@
##             next    #4477      +/-   ##
==========================================
- Coverage   83.26%   83.20%   -0.07%     
==========================================
  Files         451      464      +13     
  Lines      325765   326533     +768     
  Branches      323      323              
==========================================
+ Hits       271240   271678     +438     
- Misses      54517    54847     +330     
  Partials        8        8              
Files Coverage Δ
clarity/src/libclarity.rs 39.28% <ø> (ø)
clarity/src/proptesting/callables.rs 100.00% <100.00%> (ø)
clarity/src/proptesting/mod.rs 100.00% <100.00%> (ø)
clarity/src/proptesting/representations.rs 100.00% <100.00%> (ø)
clarity/src/proptesting/types.rs 100.00% <100.00%> (ø)
clarity/src/vm/analysis/arithmetic_checker/mod.rs 92.75% <ø> (ø)
clarity/src/vm/analysis/read_only_checker/mod.rs 87.16% <ø> (ø)
clarity/src/vm/analysis/trait_checker/mod.rs 100.00% <ø> (ø)
clarity/src/vm/analysis/type_checker/contexts.rs 93.50% <ø> (ø)
...ity/src/vm/analysis/type_checker/v2_05/contexts.rs 98.25% <ø> (ø)
... and 53 more

... and 30 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cab1a11...a53aeed. Read the comment docs.

@jcnelson (Member) commented Mar 4, 2024

You deleted this, but I think it's a valid question:

And the fundamentally correct way would be...?

If I were doing this, I would just create a trait called Mocked, similar to Default, which provided the function fn mocked() -> Self, which when called would instantiate the struct with random data from the domain of possible inputs. Then, you could have things like ClarityName::mocked() which evaluate to a ClarityName instance with plausible but synthesized data.

I've provided specific implementations to manage recursion and to handle types with requirements such as two separate Vecs that need to have the same len(). Without this, the recursion is unpredictable and can sometimes cause the test thread to go OOM.

Right -- you're doing this with or without fake.

Fake provides a lot of boilerplate that frankly would be a waste of time to re-implement for this project "just because"

It seems it provides a lot of boilerplate that this PR doesn't use? Also, to contrast this with a hypothetical Mocked trait above, it's not like it's hard to implement Mocked for primitive types. If we took that route, we'd likely just define a macro that would let us synthesize implementations for one-liners (kinda like how we do with the impl_stacks_message_codec_for_int! macro to auto-generate code to implement StacksMessageCodec for all the integer types we use).
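
To make the hypothetical concrete, here is a minimal sketch of such a Mocked trait plus an integer macro (illustrative only; neither exists in the codebase, and the use of rand here is an assumption):

```rust
// Like `Default`, but each call yields an instance populated with random
// data from the domain of valid inputs (e.g. `ClarityName::mocked()`).
trait Mocked {
    fn mocked() -> Self;
}

// One-liner implementations for primitive integers, in the spirit of the
// existing `impl_stacks_message_codec_for_int!` macro.
macro_rules! impl_mocked_for_int {
    ($($t:ty),* $(,)?) => {
        $(impl Mocked for $t {
            fn mocked() -> Self {
                rand::random()
            }
        })*
    };
}

impl_mocked_for_int!(u8, u16, u32, u64, u128);

fn main() {
    let n = u64::mocked(); // a random but valid u64
    println!("{n}");
}
```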

With a bit more effort this could easily be implemented, very similar to their built-in "locales" support. It isn't a use case at the moment, though; the basic tests only check that instances can be serialized, inserted into the database, read back, deserialized, and compare equal.

With a bit more effort we can also just avoid fake altogether ;) More seriously, we're writing this code with or without fake, meaning fake isn't providing value here.


Circling back to your first (deleted) question, I want to explore the line of questioning from @moodmosaic more. A value-add that fake or something like it could bring is easy integration with an existing property-testing framework (or more generally, other widely-used automated testing systems like fuzzers, model checkers, and so on). An argument against my hypothetical Mocked trait would be that we'd have to do all the integration work ourselves, whereas a more standardized system like fake might do this for us. That would be a "fundamentally correct" way to do this IMO -- add the means of synthesizing data structures to the codebase such that we can trivially plug them into an existing automated testing system.

@cylewitruk (Member, Author) commented Mar 4, 2024

You deleted this, but I think it's a valid question:

I deleted it simply because I thought it could come across as unfriendly :)

If I were doing this, I would just create a trait called Mocked, similar to Default, which provided the function fn mocked() -> Self, which when called would instantiate the struct with random data from the domain of possible inputs. Then, you could have things like ClarityName::mocked() which evaluate to a ClarityName instance with plausible but synthesized data.

Just to ensure correct terminology and expectations, this PR introduces faking and not mocking.

There's nothing wrong with the approach you suggest -- it was the first thing that came to mind and what I started with -- but I very quickly remembered that there are libraries that do most of this for me. So I'm personally of the opinion that it's a waste of cycles to re-implement, both on the development and maintenance fronts, when there are crates which give us that boilerplate. I would understand the scrutiny much more if this were a production crate, but this is a pretty popular crate which in probably 95% of cases can be used as a dev-dependency as intended; it's just unfortunate that cargo doesn't give us the option of propagating the current build profile to dependencies, which would be a little bit cleaner than having to use the "feature" workaround.

It seems it provides a lot of boilerplate that this PR doesn't use? Also, to contrast this with a hypothetical Mocked trait above, it's not like it's hard to implement Mocked for primitive types. If we took that route, we'd likely just define a macro that would let us synthesize implementations for one-liners (kinda like how we do with the impl_stacks_message_codec_for_int! macro to auto-generate code to implement StacksMessageCodec for all the integer types we use).

At the moment, sure, but this is just an introduction PR with minimalistic examples. My hope is that it would be used much more broadly. As you say, it's not hard to implement, but do we really want to muddy up the Stacks codebase with even more macros etc.? I'd much rather version-pin workspace dependencies and carefully upgrade than re-write code that already exists, or, even worse, essentially copy it without those OSS projects getting credit.

With a bit more effort we can also just avoid fake altogether ;) More seriously, we're writing this code with or without fake, meaning fake isn't providing value here.

If the idea was that it would be limited to the small number of examples provided here, then I would totally agree. But I think a lot of tests are missing simply because nobody wants to manually fake such types, not least the example tests I added in this PR. And on the other end of the spectrum, there is probably a lot of unnecessary initialization code which just adds to LoC and could maybe benefit from some randomization for fuzzing/prop-testing.

The real power isn't the ability to implement custom generators for types; it's the Dummy derive, which can be used for almost all types in the codebase now that these "more difficult" types have been provided -- enabling unit tests for a lot of nested types that don't have any today.

Circling back to your first (deleted) question, I want to explore the line of questioning from @moodmosaic more. A value-add that fake or something like it could bring is easy integration with an existing property-testing framework (or more generally, other widely-used automated testing systems like fuzzers, model checkers, and so on). An argument against my hypothetical Mocked trait would be that we'd have to do all the integration work ourselves, whereas a more standardized system like fake might do this for us. That would be a "fundamentally correct" way to do this IMO -- add the means of synthesizing data structures to the codebase such that we can trivially plug them into an existing automated testing system.

@moodmosaic and I discussed this briefly prior to this PR -- it would be great if we could get them to integrate, absolutely. It's a big plus that they both use rand and that fake allows the Rng instance to be provided. I can't imagine that this would be difficult to achieve, and I'd frankly rather put effort into that than into re-inventing wheels.
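
One way such an integration could look, purely as a sketch (it assumes fake and proptest are compiled against compatible rand versions; neither crate ships this bridge):

```rust
use fake::{Fake, Faker};
use proptest::prelude::*;
use rand::{rngs::StdRng, SeedableRng};

// Bridge sketch: proptest generates a seed, the seed drives a rand RNG, and
// fake synthesizes a value from that RNG, so fake-generated data can flow
// into a property test.
fn fake_string_strategy() -> impl Strategy<Value = String> {
    any::<[u8; 32]>().prop_map(|seed| {
        let mut rng = StdRng::from_seed(seed);
        let s: String = Faker.fake_with_rng(&mut rng);
        s
    })
}

proptest! {
    #[test]
    fn fake_values_flow_into_proptest(s in fake_string_strategy()) {
        // Trivially true UTF-8 invariant, just to demonstrate the wiring.
        prop_assert_eq!(s.len(), s.chars().map(|c| c.len_utf8()).sum::<usize>());
    }
}
```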

And last but not least, I used this crate heavily in my #4437 PR draft, and it both saved me a lot of time and increased my confidence in the tests.

@cylewitruk cylewitruk changed the title [Testing] Introduce fake faking library, and wrapper types for hashbrown's HashMap and HashSet [Testing] Introduce proptest property testing library, and wrapper types for hashbrown's HashMap and HashSet Mar 7, 2024
@cylewitruk (Member, Author) commented

This PR has been repurposed to introduce proptest instead of fake, which achieves the same goal but with the benefit that the type generators are usable for property testing 🚀

Note that most of the touched files simply replace hashbrown::HashMap/Set with the new wrapper types StacksHashMap/Set, which makes it much easier to swap implementations/hashers in the future and allows us to implement external traits on them.

@moodmosaic @jcnelson please re-review.

clarity/src/proptesting/types.rs (review thread resolved)
clarity/src/proptesting/values.rs (review thread resolved)
clarity/src/proptesting/values.rs (review thread resolved)
@cylewitruk cylewitruk changed the title [Testing] Introduce proptest property testing library, and wrapper types for hashbrown's HashMap and HashSet [Testing] Introduce proptest property testing library and add type aliases for hashbrown types Mar 7, 2024