
Add hypothesis tests for data operators. #1957

Open · hodgestar wants to merge 40 commits into qutip:master from hodgestar:feature/add-hypothesis-tests-for-data-operators
Conversation

@hodgestar (Contributor) commented on Jul 16, 2022:

Description
Add hypothesis strategies for data objects and some simple property-based tests for data operations.

We aim for compatibility with numpy, but there are caveats in a few different cases:

| Operations | Non-finites equivalent | Notes |
| --- | --- | --- |
| Creation & negation | | No caveats |
| Equality checks | | The same tolerances must be specified |
| Addition & subtraction | Yes | |
| Scalar multiplication & division | Yes | CSR may raise an exception if the scalar is non-finite |
| Matrix multiplication | Not tested | Magnitude range is restricted to ensure precision |
| Trace, adjoint, transpose, conj & copy | | No caveats |

In some cases we also need to ignore warnings raised by numpy about operations involving nan and inf when calculating the expected result.
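For illustration, a minimal sketch of the kind of property-based test this adds. The strategy module qst and the shape-strategy argument to qobj_datas follow the test file quoted later in this thread; the test name, shape bounds, and tolerances are illustrative, and to_array() is assumed to convert a Data object to an ndarray:

import numpy as np
from hypothesis import given, strategies as st
from qutip.tests import strategies as qst

# One shared shape strategy, so both operands get identical dimensions.
same_shape = st.shared(st.tuples(st.integers(1, 8), st.integers(1, 8)))

@given(qst.qobj_datas(shape=same_shape), qst.qobj_datas(shape=same_shape))
def test_data_addition_matches_numpy(a, b):
    # Silence numpy's nan/inf warnings while computing the expected result.
    with np.errstate(all="ignore"):
        expected = a.to_array() + b.to_array()
    np.testing.assert_allclose((a + b).to_array(), expected, atol=1e-12, rtol=0)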

Related issues or PRs

  • None

@hodgestar (Contributor, Author) commented:

@Zac-HD Any suggestions for ways we could use hypothesis better? It all looks quite clean right now.

Is there a way we can change what hypothesis prints when an example is found? E.g. we don't want the data layer object repr to include the contents of the array, but it would be nice to see that in the test failure output.

@Zac-HD left a comment:

All looks good to me!

> Is there a way we can change what hypothesis prints when an example is found?

You can't change the failing example (except by changing the repr or hacking our internal pretty-printer - please don't!), but you can add a call to hypothesis.note("whatever representation you want to see") inside the test function, and you'll see that on failure.
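For example, a self-contained sketch; the array strategy and the (deliberately falsifiable) property are placeholders chosen only to show note() in a failure report:

import numpy as np
from hypothesis import given, note
import hypothesis.extra.numpy as npst

@given(npst.arrays(np.complex128, (2, 2)))
def test_note_example(arr):
    # note() output appears in the report only for the falsifying example,
    # so the array contents are visible even if an object's repr omits them.
    note(f"array contents: {arr!r}")
    assert np.isfinite(arr).all()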

@AGaliciaMartinez (Member) left a comment:

The hypothesis testing approach seems to generate very clean and readable code! I really like this approach for testing, as it seems to simplify writing tests. I see that some tests do not fully work yet, but the errors seem to be mostly related to inf and nan values not being properly handled. We could skip those tests by using the elements argument in npst.arrays, but I'm not sure if we should. Also, absolute tolerances may need to be included in tests if we test with values close to zero.
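For reference, the elements= approach mentioned above might look like this (a sketch; the dtype and shape are arbitrary):

import numpy as np
from hypothesis import strategies as st
import hypothesis.extra.numpy as npst

# Restrict generated values to finite complex numbers (no nan or inf).
finite_arrays = npst.arrays(
    np.complex128, (4, 4),
    elements=st.complex_numbers(allow_nan=False, allow_infinity=False),
)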

@hodgestar What is the goal for this PR? Is this meant to ultimately replace the current testing code employed for the data layer, or is it just meant to test Data's operators?

@Zac-HD commented on Aug 6, 2022:

(unsubscribing because I think there's not much for me to add, but please @ me if that changes)

@hodgestar (Contributor, Author) commented:

@Ericgig @AGaliciaMartinez This is ready for another round of review.

I fixed small bugs in mul and iszero -- we should discuss whether the fixes are good or not.

There are still some failing examples I'm encountering locally that we should also decide what to do with:

scalar division:

    # result = a / x
    # numpy returns array([[nan+nanj]])
    # our result:
    result [Data, Dense]: array([[8.98846567e+307+8.98846567e+307j]])
    a [Data, Dense]: array([[9.97920155e+291+1.79769313e+308j]])
    x: (1+1j)

matmul:
The current failing example I'm seeing locally is just an atol issue encountered for large exponents, but I have encountered other failing examples that I still need to dig into.

@hodgestar
Copy link
Contributor Author

@Zac-HD Would you mind doing a quick sanity check of my MatrixShapesStrategy class (

class MatrixShapesStrategy(st.SearchStrategy):
    """ A strategy for producing compatible matrix shapes.

    Parameters
    ----------
    shapes : list of shape specifications
        Each shape specification should be a tuple of dimension labels,
        e.g. ("n", "k"). Dimensions with the same labels will be
        given the same length in the resulting shapes. Fixed dimensions
        may be specified with an integer constant, e.g. ("n", 3).
    min_side : int
        The minimum length of any one dimension. Default 1.
    max_side : int
        The maximum length of any one dimension. Default 64.

    Examples
    --------
    MatrixShapesStrategy([("n", "k"), ("k", "m")])
    MatrixShapesStrategy([("j", 3), (3, "j")])
    """
    def __init__(
        self,
        shapes,
        min_side=1,
        max_side=64,
    ):
        super().__init__()
        self.side_strat = st.integers(min_side, max_side)
        self.shapes = tuple(shapes)
        self.min_side = min_side
        self.max_side = max_side

    def do_draw(self, data):
        dims = {}
        shapes = []
        for shape in self.shapes:
            shapes.append([])
            for name in shape:
                if isinstance(name, int):
                    shapes[-1].append(name)
                    continue
                if name not in dims:
                    # Draw each named dimension once and reuse it everywhere.
                    dims[name] = data.draw(self.side_strat)
                shapes[-1].append(dims[name])
        return tuple(tuple(s) for s in shapes)
)? Mostly I want to know whether I should have been able to do this with mutually_broadcastable_shapes.

I also wouldn't mind a quick check of qobj_shared_shapes (

def qobj_shared_shapes(shapes):
    """
    Return a tuple of strategies that each return one shape from
    a shared MatrixShapesStrategy strategy defined by the given shapes.

    Examples
    --------
    >>> shape_a, shape_b = qobj_shared_shapes([("n", "k"), ("k", "m")])

    In the example above shape_a would return the shapes generated for
    ("n", "k") and shape_b the shapes generated for ("k", "m").
    """
    import operator
    shapes_st = MatrixShapesStrategy(shapes)
    shapes_st = st.shared(shapes_st)
    return tuple(
        st.builds(operator.itemgetter(i), shapes_st)
        for i in range(len(shapes))
    )
). Is this a good thing to be doing with shared strategies?
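For context, here is roughly how these shared strategies are meant to be consumed (a sketch: qst.qobj_datas is used as in the tests above, and the test body and name are illustrative):

from hypothesis import given

shape_a, shape_b = qobj_shared_shapes([("n", "k"), ("k", "m")])

@given(qst.qobj_datas(shape=shape_a), qst.qobj_datas(shape=shape_b))
def test_matmul_shapes(a, b):
    # Both draws come from one shared MatrixShapesStrategy, so the inner
    # dimensions always agree and a @ b is well defined.
    assert (a @ b).shape == (a.shape[0], b.shape[1])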

@hodgestar (Contributor, Author) commented:

> Yes, you should use mutually_broadcastable_shapes(signature=) for that. The signature syntax is a little different, since Numpy does it with strings, but we already support named dimensions, constant-size dimensions, and optional dims. Should be a bit more efficient too.

I tried with signature= quite a bit but (m, k), (k, n) always produced things like (1, 1, 1), (1, 1) despite setting max_dims and min_dims to various things.

> It does technically work? But in such a situation I'd probably just use st.data() and draw in the body of the test; see `hypothesis write numpy.matmul` for an example. If you expected to use it in many tests maybe a custom strategy, but for just test_data_matmul_operator that doesn't seem worth it.

I expected there to be other use cases. I started with the output of `hypothesis write numpy.matmul` but ended up playing around with my own strategy (see the difficulties with mutually_broadcastable_shapes above) and liked the idea of not having to unpack `a` and `b` inside the test, and of being able to supply the shapes for `a` and `b` to other strategies (e.g. maybe we will have a strategy for Qobj and QobjEvo objects and want to try various combinations of multiplying things together).
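For reference, the signature= form under discussion looks roughly like this (a sketch, not the exact code tried; the signature string follows numpy's gufunc syntax, and max_dims=0 suppresses the extra broadcast dimensions):

import hypothesis.extra.numpy as npst

# Each draw is a BroadcastableShapes with input_shapes ((m, k), (k, n))
# and result_shape (m, n).
matmul_shapes = npst.mutually_broadcastable_shapes(
    signature="(m,k),(k,n)->(m,n)",
    max_dims=0,
)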

@Zac-HD commented on Aug 7, 2022:

> I tried with signature= quite a bit but (m, k), (k, n) always produced things like (1, 1, 1), (1, 1) despite setting max_dims and min_dims to various things.

Huh, that's surprising. Share some example code and I'll have a look?

> I expected there to be other use cases. I started with the output of `hypothesis write numpy.matmul` but ended up playing around with my own strategy (see the difficulties with mutually_broadcastable_shapes above) and liked the idea of not having to unpack `a` and `b` inside the test, and of being able to supply the shapes for `a` and `b` to other strategies (e.g. maybe we will have a strategy for Qobj and QobjEvo objects and want to try various combinations of multiplying things together).

👍 sounds good, carry on then!

@hodgestar (Contributor, Author) commented on Aug 8, 2022:

@Ericgig and I had a brief discussion about how to handle inf and nan generally. One option is to forbid them entirely from the data layer -- i.e. to check and raise an error whenever a new data-layer object is created. A potential downside to this is that we'd have to check often to be sure, since any operation can potentially create nan and inf, even simple addition:

>>> 1e308 + 1e308
inf
>>> (1e308 + 1e308) - (1e308 + 1e308)
nan

An alternative is to try to propagate nan and inf only as correctly as, e.g., BLAS does. What happens with BLAS is that nan and inf propagate reasonably correctly, but are considered fairly interchangeable -- i.e. nan, inf, and -inf are all just bad values.
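A minimal sketch of the first option, with a hypothetical validation helper that would run at data-layer construction time:

import numpy as np

def _check_finite(values):
    # Hypothetical helper: reject non-finite entries at creation rather
    # than letting nan/inf propagate through later operations.
    if not np.isfinite(values).all():
        raise ValueError("data-layer values must be finite")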

@Zac-HD commented on Aug 8, 2022:

(out of my wheelhouse again; @ me again if that changes (again!))

@Ericgig (Member) left a comment:

Hypothesis still feels somewhat magical, but I am starting to understand it.
Right now it seems to create data with mostly full arrays. I don't know how hard it would be to create a strategy for different densities and to add the flags for Hermitian, etc., but it will be needed eventually.

The tolerances used are too tight, making some tests fail.

Also, from the old scipy test:

>>> a = np.array([[7.59109827e+14+9.96089747e+14j]])
>>> b = np.array([[7.22335902e+307+1.64618697e+308j]])
>>> np.allclose(b, a), np.allclose(a, b)
(False, True)

(np.allclose(a, b) checks abs(a - b) <= atol + rtol * abs(b); here abs(b) overflows to inf, making the tolerance infinite, which is why the check is not symmetric in its arguments.)
I am not sure we should ignore the overflow warning...


@given(qst.qobj_datas(shape=same_shape), qst.qobj_datas(shape=same_shape))
def test_data_equality_operator_same_shapes(a, b):
    result = (a == b)
A member commented:

Here, qutip uses atol=1e-12, rtol=0 by default.
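So a numpy-based expectation has to mirror those tolerances explicitly, roughly as follows (a sketch; to_array() is assumed to convert a Data object to an ndarray):

import numpy as np

def expected_equality(a, b):
    # Mirror qutip's defaults (atol=1e-12, rtol=0) in the numpy comparison.
    return np.allclose(a.to_array(), b.to_array(), atol=1e-12, rtol=0)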

@hodgestar changed the base branch from dev.major to master on April 11, 2023.
@hodgestar (Contributor, Author) commented on Apr 12, 2023:

@Ericgig I've expanded the tests and reworked this so that it's less of a WIP. I've also documented the expected behaviour of the data layer in the presence of nans, etc., in the description of the PR. I'll find somewhere in the docs to add that before merging the PR.

Would you mind taking another look and letting me know what you think?

P.S. I also haven't had a clean test run on GitHub Actions yet, but the tests pass consistently on my machine now. I think I cleaned up most of the issues from the recent GitHub Actions runs in my latest commits, but I'll know in the morning. Obviously CI needs to pass consistently before merging can happen.

@coveralls commented on Apr 12, 2023:

Coverage Status

Coverage: 75.624% (+0.3%) from 75.317% when pulling 133677f on hodgestar:feature/add-hypothesis-tests-for-data-operators into fb72696 on qutip:master.

@Ericgig (Member) left a comment:

The bug in the 3.9 tests seems to be a real bug with the new tests; it should be fixed.
If it's specific to a numpy version (1.20), we can officially drop that version.

I see no mention of tidyup; some of the tested operations use it. Should it not be turned off for these tests? Otherwise it will cause random failures eventually.

What is the plan for other functions?
Most of the common functions are tested, but some often-used ones are missing (kron, l2, isherm, expect).

I am also curious about how some of the more complex functions (inv, expm, pow) would fare, but I don't expect them to play nicely with this kind of test.

I disagree with making our code worse for common use cases just to handle junk the same way others do.

@hodgestar (Contributor, Author) commented:

> The bug in the 3.9 tests seems to be a real bug with the new tests; it should be fixed. If it's specific to a numpy version (1.20), we can officially drop that version.

Yes, I need to fix that.

> I see no mention of tidyup; some of the tested operations use it. Should it not be turned off for these tests? Otherwise it will cause random failures eventually.

Disabling tidyup during the tests is a good idea.
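One way to do that, assuming the v5-style qutip.CoreOptions context manager and its auto_tidyup flag (treat both names as assumptions here), would be an autouse fixture:

import pytest
import qutip

@pytest.fixture(autouse=True)
def no_tidyup():
    # Assumed API: CoreOptions temporarily overrides core settings,
    # restoring them when the context exits.
    with qutip.CoreOptions(auto_tidyup=False):
        yield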

> What is the plan for other functions? Most of the common functions are tested, but some often-used ones are missing (kron, l2, isherm, expect).

For this PR I just covered all the functions that are methods of the Data class. Some of these call out to dispatch functions, but I included those anyway since they're part of the Data class API.

> I am also curious about how some of the more complex functions (inv, expm, pow) would fare, but I don't expect them to play nicely with this kind of test.

I suspect those will be a bit tricky, but I think we can still find a way to assert that they give the right answer even if it requires restricting a bit what examples hypothesis generates.

> I disagree with making our code worse for common use cases just to handle junk the same way others do.

I don't think I did this anywhere now? In some cases what QuTiP was doing was just wrong, e.g. inf * CSR-zeros is just NaNs everywhere, so the previous QuTiP answer was wrong. We also don't have to handle junk the same way others do, but we do have to handle junk in a way that is sensible and avoids silently propagating or swallowing errors. People are going to make mistakes, and that set of people includes us as core QuTiP developers.

Comment on lines -23 to -24
if value == 0:
    return csr.zeros(matrix.shape[0], matrix.shape[1])
A member commented:

I hate the removal of this shortcut.
mul(data, 0) can happen often in QobjEvo when the coefficient is a pulse or step function.
Without this shortcut, they need to loop over the values twice: once for this mul, once for the add or matmul that follows.
Thus this shortcut actually helps users.
It is also one of the only places where there is a mul(data, 0) in the code.

In some theoretical case it could swallow a NaN, but I don't see any realistic case where this could hide an error.

So everyone loses for a case that will probably never happen...

Can you write a code example where this would save us? Something that looks like it should work, but causes infinities or NaNs that would be erased by a multiplication by 0 without returning garbage.

If we can't even think of such a case, let's keep the shortcut.
If you can find one, let's loop over the values to find NaNs and return the zeros CSR if none are found, so we at least loop only once over the values (see the sketch below).
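That compromise might look roughly like this (a sketch; the helper name is hypothetical, and as_scipy() is used to reach the CSR value buffer):

import numpy as np
from qutip.core.data import csr

def zero_mul_shortcut(matrix):
    """Return csr.zeros when safe, or None so the caller falls back."""
    values = matrix.as_scipy().data
    # Single pass over the stored values: only take the shortcut when
    # there is no nan/inf that multiplication by zero would swallow.
    if np.isfinite(values).all():
        return csr.zeros(matrix.shape[0], matrix.shape[1])
    return None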

@hodgestar (Contributor, Author) commented:

Maybe we just need to make _data.zeros[type(x)](*x.shape) easier to use so that we don't need to write mul(data, 0)? I can add _data.zeros_like(x) if that would be better?

In the specific case where mul(data, 0) is intended to create a matrix of zeros of the right shape, the current fast path can never be wrong, since that is exactly what it does. In general, mul(data, 0) is currently just incorrect floating-point arithmetic.

I was wondering about adding a has_non_finite flag to CSR. It would be a bit of a pain to keep track of everywhere, but it would allow us to fast-path the all-finite case without sacrificing correctness.

If this is a blocker for this PR, I'm happy to undo the current change and make the tests pass by explicitly stating that the expected result is 0 in those cases in the test.

I also don't want to sacrifice performance, but it is important to have a clear set of tests that show how we expect the data layer to behave in these edge cases.
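For concreteness, the proposed helper would be a thin wrapper over the dispatch call quoted above (a sketch):

from qutip.core import data as _data

def zeros_like(x):
    # Dispatch on the data type of x to build a zeros matrix of the same
    # shape and storage format.
    return _data.zeros[type(x)](*x.shape)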

A member commented:

I am thinking more about the use in QobjEvo. I don't think we ever use it as _data.zeros_like in the code base, but that function would be useful, as would an identity_like, as we discussed somewhere.

I would be fine with a has_non_finite flag, but I think we should raise an error any time we would set that flag.
I think we could do the check when we tidyup: we already check the scale of all the values there, and tidyup runs in most operations that could create zeros, which are exactly the functions at risk.

I see a CSR matrix full of explicit zeros as wrong, if not an error, so this is a blocker for me.
That, and the tests should pass reliably.

A member commented:

I am using mul as _data.zeros_like in stochastic...
