Tests are too slow for libraries with eager-mode dispatch overhead #197

jakevdp · 2023-11-09T16:42:35Z

For example this job: https://github.com/google/jax/actions/runs/6804780902/job/18502996983, which is associated with google/jax#16099 took 4.5 hours to run about 1000 tests.

========== 567 failed, 439 passed, 116 skipped in 16419.47s (4:33:39) ==========

This is slow enough to be virtually unusable.

~~Is there anything I can do to speed up the tests during development of the JAX array API?~~

Edit: as mentioned below, this is due to the frequent use of patterns within the tests that every value in each array via indexing, and the fact that in eager execution, each of these indexing operations have a small dispatch overhead that leads to slow tests when arrays are large.

The text was updated successfully, but these errors were encountered:

asmeurer · 2023-11-09T19:57:20Z

They shouldn't be that slow. The NumPy tests finish in a fraction of that time https://github.com/data-apis/array-api-compat/actions/runs/6565885145/job/17835396340. It might be worth investigating what is going on.

Although even there, the NumPy tests took an hour to run, which seems like a lot. We should investigate if something slowed down recently.

Generally, though, you can speed things up by lowering the hypothesis --max-examples. By default it is 100, but something like 50 should make the tests run in roughly half the time.

asmeurer · 2023-11-09T20:11:55Z

I just ran ARRAY_API_TESTS_MODULE=numpy.array_api pytest array_api_tests/ locally and it took 12 minutes. So that's the about time the test should run in (there were a few failures which speed up the runtime a bit, but in general it shouldn't take more than 20 minutes).

asmeurer · 2023-11-09T20:13:41Z

Can you run something like pytest --durations=10 to print the 10 slowest tests? That would help to pin down what is going on.

jakevdp · 2023-11-09T20:18:05Z

I think the slowness comes from test patterns that look like this:

array-api-tests/array_api_tests/test_creation_functions.py

Lines 358 to 364 in f82c7bc

    
           for i in range(n_rows): 
        
               for j in range(_n_cols): 
        
                   f_indexed_out = f"out[{i}, {j}]={out[i, j]}" 
        
                   if j - i == kw.get("k", 0): 
        
                       assert out[i, j] == 1, f"{f_indexed_out}, should be 1 {f_func}" 
        
                   else: 
        
                       assert out[i, j] == 0, f"{f_indexed_out}, should be 0 {f_func}"

In JAX, each operation outside the context of JIT compilation has a small amount of overhead related to dispatch and device placement for the output, so running $\mathcal{O}[N^2]$ indexing operations in a loop will accumulate that overhead and be very slow.

asmeurer · 2023-11-09T20:44:44Z

There's a variable that limits the max array size

array-api-tests/array_api_tests/hypothesis_helpers.py

Line 167 in 37bbb58

MAX_ARRAY_SIZE = 10000

. The default is 10000 but we should make it configurable. Lowering it to 1000 or so for JAX would probably fix this.

jakevdp · 2023-11-09T23:59:48Z

I pasted the slowest 20 tests below.

For the most part I think it comes down to the slow repeated indexing I mentioned above: every one of these that I've checked indexes each element of the output to check it against a reference implementation.

54.85s call     array_api_tests/test_creation_functions.py::test_linspace
35.72s call     array_api_tests/test_creation_functions.py::test_eye
25.87s call     array_api_tests/test_operators_and_elementwise_functions.py::test_bitwise_left_shift[__lshift__(x, s)]
21.24s call     array_api_tests/test_operators_and_elementwise_functions.py::test_subtract[subtract(x1, x2)]
20.68s call     array_api_tests/test_operators_and_elementwise_functions.py::test_greater[greater(x1, x2)]
20.64s call     array_api_tests/test_operators_and_elementwise_functions.py::test_bitwise_and[bitwise_and(x1, x2)]
19.93s call     array_api_tests/test_operators_and_elementwise_functions.py::test_not_equal[__ne__(x, s)]
19.88s call     array_api_tests/test_operators_and_elementwise_functions.py::test_not_equal[not_equal(x1, x2)]
19.72s call     array_api_tests/test_operators_and_elementwise_functions.py::test_less_equal[less_equal(x1, x2)]
19.59s call     array_api_tests/test_operators_and_elementwise_functions.py::test_less[less(x1, x2)]
18.77s call     array_api_tests/test_operators_and_elementwise_functions.py::test_bitwise_or[bitwise_or(x1, x2)]
18.14s call     array_api_tests/test_operators_and_elementwise_functions.py::test_greater_equal[greater_equal(x1, x2)]
17.24s call     array_api_tests/test_operators_and_elementwise_functions.py::test_bitwise_left_shift[bitwise_left_shift(x1, x2)]
16.71s call     array_api_tests/test_operators_and_elementwise_functions.py::test_divide[divide(x1, x2)]
14.99s call     array_api_tests/test_operators_and_elementwise_functions.py::test_multiply[multiply(x1, x2)]
13.70s call     array_api_tests/test_operators_and_elementwise_functions.py::test_bitwise_xor[bitwise_xor(x1, x2)]
13.69s call     array_api_tests/test_operators_and_elementwise_functions.py::test_logical_or
13.46s call     array_api_tests/test_operators_and_elementwise_functions.py::test_atan2
13.00s call     array_api_tests/test_operators_and_elementwise_functions.py::test_add[add(x1, x2)]

jakevdp · 2023-11-10T19:27:02Z

Would you be open to changing logic that looks like this:

array-api-tests/array_api_tests/test_creation_functions.py

Lines 358 to 364 in f82c7bc

    
           for i in range(n_rows): 
        
               for j in range(_n_cols): 
        
                   f_indexed_out = f"out[{i}, {j}]={out[i, j]}" 
        
                   if j - i == kw.get("k", 0): 
        
                       assert out[i, j] == 1, f"{f_indexed_out}, should be 1 {f_func}" 
        
                   else: 
        
                       assert out[i, j] == 0, f"{f_indexed_out}, should be 0 {f_func}"

to something more like this?

k = kw.get("k", 0)
expected = [[1 if j - i == k else 0 for j in range(_n_cols)] for i in range(n_rows)]
assert xp.all(xp.asarray(expected) == out)

It does depend on asarray and all working properly, but the previous approach depends on indexing working properly. You lose the granularity of the error, but that could be addressed using where when generating the error message.

Rewriting tests this way would make the test suite usable for JAX and other libraries that have non-negligible dispatch overhead in eager mode.

I would be happy to prepare a PR if you think this is the right direction for the testing suite.

asmeurer · 2023-11-10T21:02:23Z

That would work for the creation functions like eye, but how would you change the elementwise function tests? For example, test_bitwise_left_shift (one of your slowest tests):

array-api-tests/array_api_tests/test_operators_and_elementwise_functions.py

Lines 822 to 837 in f82c7bc

    
           def test_bitwise_left_shift(ctx, data): 
        
               left = data.draw(ctx.left_strat, label=ctx.left_sym) 
        
               right = data.draw(ctx.right_strat, label=ctx.right_sym) 
        
               if ctx.right_is_scalar: 
        
                   assume(right >= 0) 
        
               else: 
        
                   assume(not xp.any(ah.isnegative(right))) 
        
               res = ctx.func(left, right) 
        
               binary_param_assert_dtype(ctx, left, right, res) 
        
               binary_param_assert_shape(ctx, left, right, res) 
        
               nbits = res.dtype 
        
               binary_param_assert_against_refimpl( 
        
                   ctx, left, right, res, "<<", lambda l, r: l << r if r < nbits else 0 
        
               )

The loop is done in this helper

array-api-tests/array_api_tests/test_operators_and_elementwise_functions.py

Line 311 in f82c7bc

for l_idx, r_idx, o_idx in sh.iter_indices(left.shape, right.shape, res.shape):

You have to loop through the input arrays to generate the exact outputs. Maybe there's some way to extract the original input array list from hypothesis (assuming it didn't go through any further transformations after being converted to an array). @honno, thoughts?

Anyway, I think changing it to be like that for the creation functions is fine.

asmeurer · 2023-11-10T21:04:12Z

Anyway, I think the best solution for you is to make MAX_ARRAY_SIZE configurable. I expect 1000 (or even 100) would be fine for most test cases, but would drop the runtime of those tests down correspondingly.

jakevdp · 2023-11-10T22:02:21Z

Another option would be to use dlpack to export the full array to numpy or some other standard, where eager-mode indexing is not problematic.

rgommers · 2023-11-14T10:40:07Z

I did a bit of profiling and testing to figure out what's going on. The result of:

$ py-spy record -o profile.svg -- pytest array_api_tests/

is this:

That takes about 15 minutes, and ~75% of the time is spent in test_special_cases.py. Special cases are really not of interest to start with when developing an array API implementation, so the next step bash to add:

--ignore=array_api_tests/test_special_cases.py

That brings it down to ~4 minutes. With NumPy 1.26.0, there are still 4 failures when running on numpy.array_api. These 4 failures can be reproduced also with

--max-examples=1

and that runs in about 3 seconds. Adding back the special cases tests makes it run in 18 seconds with --max-examples=1.

Right now we are in the position that we need this 3 second run to pass on libraries. That is far more important than anything else; there are, as of now, zero implementations that actually pass 100%. The array_api_compat layer is missing array methods like .mT and .to_device for numpy and shows casting rule issues (as expected until numpy 2.0); plain numpy.array_api comes closest but still needs a couple of fixes.

It looks to me like we need to focus on that. And in addition, make some of the extra-costly checks for JAX vectorized, as we're discussing in gh-200 right now.

jakevdp · 2023-11-14T19:12:17Z

Thanks - with --max-examples=5 and a number of skips related to mutation and other issues, the jax.experimental.array_api PR now passes the test suite! google/jax#16099

I did find that it errors with this week's hypothesis 6.88.4 release; perhaps I should file a bug for that separately.

rgommers · 2023-11-14T19:17:52Z

Awesome, that's great to see!

jakevdp changed the title ~~Tests are too slow...~~ Tests are too slow for libraries with eager-mode dispatch overhead Nov 10, 2023

jakevdp mentioned this issue Nov 10, 2023

Make test_eye more efficient #200

Closed

honno mentioned this issue Jan 16, 2024

DeadlineExceeded failures in test_eye/test_linspace #214

Open

jakevdp mentioned this issue Apr 22, 2024

ENH: array types: add JAX support scipy/scipy#20085

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tests are too slow for libraries with eager-mode dispatch overhead #197

Tests are too slow for libraries with eager-mode dispatch overhead #197

jakevdp commented Nov 9, 2023 •

edited

asmeurer commented Nov 9, 2023

asmeurer commented Nov 9, 2023

asmeurer commented Nov 9, 2023

jakevdp commented Nov 9, 2023 •

edited

asmeurer commented Nov 9, 2023 •

edited

jakevdp commented Nov 9, 2023 •

edited

jakevdp commented Nov 10, 2023 •

edited

asmeurer commented Nov 10, 2023

asmeurer commented Nov 10, 2023

jakevdp commented Nov 10, 2023

rgommers commented Nov 14, 2023

jakevdp commented Nov 14, 2023

rgommers commented Nov 14, 2023

Tests are too slow for libraries with eager-mode dispatch overhead #197

Tests are too slow for libraries with eager-mode dispatch overhead #197

Comments

jakevdp commented Nov 9, 2023 • edited

asmeurer commented Nov 9, 2023

asmeurer commented Nov 9, 2023

asmeurer commented Nov 9, 2023

jakevdp commented Nov 9, 2023 • edited

asmeurer commented Nov 9, 2023 • edited

jakevdp commented Nov 9, 2023 • edited

jakevdp commented Nov 10, 2023 • edited

asmeurer commented Nov 10, 2023

asmeurer commented Nov 10, 2023

jakevdp commented Nov 10, 2023

rgommers commented Nov 14, 2023

jakevdp commented Nov 14, 2023

rgommers commented Nov 14, 2023

jakevdp commented Nov 9, 2023 •

edited

jakevdp commented Nov 9, 2023 •

edited

asmeurer commented Nov 9, 2023 •

edited

jakevdp commented Nov 9, 2023 •

edited

jakevdp commented Nov 10, 2023 •

edited