Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test differentiation of Oceananigans broadcast kernel with Enzyme #3598

Open
wants to merge 47 commits into
base: main
Choose a base branch
from

Conversation

jlk9
Copy link
Collaborator

@jlk9 jlk9 commented May 10, 2024

There was a bug in some recent updates to KernelAbstractions.jl that caused Enzyme to break on broadcasting arrays in Oceananigans. This PR includes a test to make sure this bug doesn't occur again.

jlk9 and others added 30 commits February 14, 2024 15:44
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
Co-authored-by: Navid C. Constantinou <navidcy@users.noreply.github.com>
…o be pushed onto main! I'm just setting making sure we have a working test before I try changing to the vararg approach
…_free_surface_tendencies.jl with use of Varargs
test/test_enzyme.jl Outdated Show resolved Hide resolved
test/test_enzyme.jl Outdated Show resolved Hide resolved
test/test_enzyme.jl Outdated Show resolved Hide resolved
@glwagner
Copy link
Member

glwagner commented May 10, 2024

There was a bug in some recent updates to KernelAbstractions.jl that caused Enzyme to break on broadcasting arrays in Oceananigans. This PR includes a test to make sure this bug doesn't occur again.

Interesting! I think it's ok to add a broadcasting test. But it will be confusing to future developers if the test is explained / written as somehow testing a bug in another package. If there's a bug somewhere else, we need a test in that packge (presumably that has been added).

This test also seems a little complicated. Why not just write a simple function that does a broadcast, and then try to autodiff that? Why do we need initial conditions, models, etc?

For example

function times_c!(a, b, c)
    a .= b .* c # c is a number
    return sum(a) # or whatever we gotta return
end

grid = RectilinearGrid(arch, size=(1, 1, 1), extent=(1, 1, 1))
a = CenterField(grid)
b = CenterField(grid)
c = 2
@test try 
    autodiff(times_c!, a, b, c... # or something)
    true
catch
    false
end

It's super important for tests to be as short and easy to understand as possible, because maintaining test code is one of the main bottlenecks on development.

@wsmoses
Copy link
Collaborator

wsmoses commented May 11, 2024

Perhaps I can chime in here to give some context.

Something in the Oceananigans/KA/Enzyme/etc setup was breaking our integration test of the advection-diffusion (#3480) which was blocking us for making progress for some time.

Eventually Joe successfully minimized it down to this point as a minimal error (effectively just testing successful AD of Oceanigans.Utils.launch!, which was failing). We later determined the root cause of the issue to be a problem in KA (JuliaGPU/KernelAbstractions.jl#476).

The purpose of this is not to specifically act as a unit test for the individual KA issue, but to be a small unit test for Oceananigans modified launching infrastructure. That way if something else comes up as a bug in a future integration test, we can quickly find the root cause without weeks of debugging from the whole integration test.

@wsmoses
Copy link
Collaborator

wsmoses commented May 11, 2024

@glwagner I've modified the PR to make clear that this is providing unit tests of the Oceananigans setting utility/broadcast functionality, at at increasingly high level

@wsmoses wsmoses changed the title Test for 'Any' constructor bug with Enzyme Test differentiation of Oceananigans broadcast kernel with Enzyme May 11, 2024
@wsmoses
Copy link
Collaborator

wsmoses commented May 13, 2024

@glwagner can you give this another round of review?

@glwagner
Copy link
Member

Okay, it looks like we want to test set!(model, ...) so we need a model!

The tests are failing? Which maybe means they are working.

https://buildkite.com/clima/oceananigans/builds/15738#018f734d-a866-49c7-9114-390d6932cb34/18-355

@wsmoses
Copy link
Collaborator

wsmoses commented May 13, 2024

Or more specifically that we need to enfore an Enzyme version bump to 0.12.5 rather than the 0.11.20 it is on

@glwagner
Copy link
Member

word

@wsmoses
Copy link
Collaborator

wsmoses commented May 14, 2024

@glwagner I have no idea what the agent lost CI thing is

@glwagner
Copy link
Member

Enzyme + Oceananigans Initialization Broadcast Kernel: Error During Test at /var/lib/buildkite-agent/builds/tartarus-14/clima/oceananigans/test/test_enzyme.jl:59
--
  | Got exception outside of a @test
  | MethodError: no constructors have been defined for Any

🤔

https://buildkite.com/clima/oceananigans/builds/15740#018f756a-5b07-47c3-be29-9fc9b4d9d3ab/29-348

@wsmoses
Copy link
Collaborator

wsmoses commented May 14, 2024

Ah the classic (which this test is designed to catch). We also need to bump KA @glwagner if you would do the honors.

@glwagner
Copy link
Member

Ah the classic (which this test is designed to catch). We also need to bump KA @glwagner if you would do the honors.

sure thing -- to 0.9.19?

@wsmoses
Copy link
Collaborator

wsmoses commented May 14, 2024

Yeah

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants