add a gpu scaling job with diagnostics #2852

szy21 · 2024-03-28T18:19:51Z

Purpose

To-do

Content

I have read and checked the items on the review checklist.

szy21 · 2024-03-28T20:24:08Z

gpu build: https://buildkite.com/clima/climaatmos-target-gpu-simulations/builds/251

The simulation runs for one day, with daily-averaged default output. The SYPD values at the end of the simulation with and without diagnostics are 0.54 and 0.46, respectively. The SYPD during time stepping is similar. @Sbozzolo @charleskawczynski Do you think it's useful to add a job like this in the gpu scaling pipeline?

Sbozzolo · 2024-03-28T20:46:49Z

In #2646, I was trying to add a job like this, but also producing the flame graph, so that we have an actionable table of parts to optimize. However, I am running into limits for ProfileCanvas, and the HTML cannot be rendered because it has too many entries.

The difference in SYPD during runtime tells us that the online SYPD is not computed correctly for the last step. The SYPD is computed at the beginning of each step, so it does not account for the time spent saving the output in the last step.
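To make the point above concrete, here is a toy sketch (not the ClimaAtmos implementation) of a time loop whose online SYPD is sampled at the beginning of each step. All names (`sypd`, `run`, `step_cost`, `output_cost`) and the 365-day year are assumptions made for illustration. The last online sample is taken before the final step's output is written, so it comes out higher than the SYPD computed once the run has finished.

```python
SECONDS_PER_DAY = 86400.0
DAYS_PER_YEAR = 365.0  # assumption: some codes use 365.25 instead


def sypd(simulated_seconds, wall_seconds):
    """Simulated years per wall-clock day."""
    simulated_years = simulated_seconds / (SECONDS_PER_DAY * DAYS_PER_YEAR)
    wall_days = wall_seconds / SECONDS_PER_DAY
    return simulated_years / wall_days


def run(n_steps, dt, step_cost, output_cost):
    """Toy time loop (hypothetical, for illustration only).

    Every step costs `step_cost` wall seconds; the last step additionally
    pays `output_cost` wall seconds for writing diagnostics.
    Returns (last online SYPD, SYPD measured after the run ends).
    """
    wall = 0.0
    sim = 0.0
    online = None
    for i in range(n_steps):
        # Online estimate is sampled at the *beginning* of the step,
        # so it never sees the cost of the step that is about to run.
        if sim > 0:
            online = sypd(sim, wall)
        wall += step_cost
        if i == n_steps - 1:
            wall += output_cost  # output is written only in the last step
        sim += dt
    final = sypd(sim, wall)
    return online, final


if __name__ == "__main__":
    online, final = run(n_steps=10, dt=100.0, step_cost=1.0, output_cost=5.0)
    print(f"last online SYPD: {online:.4f}, end-of-run SYPD: {final:.4f}")
```

With a nonzero `output_cost`, the end-of-run value is lower than the last online value; with `output_cost=0.0` the two agree. Sampling the estimate after the step (including output) would close that gap in this toy model.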

Is this job representative of the ideal job you want to run? Does it have all the physics you want to run and all the diagnostics you want to save?

szy21 · 2024-03-28T21:06:08Z

This job is a good representative of the atmosphere-only run without EDMF (DYAMOND). The GPU scaling jobs only run for 1 day, with 1-day averaged diagnostics. In the end, I think we want to run it with mostly monthly averaged diagnostics, so there may be some differences. Other than that, I think this job is good.

Sbozzolo · 2024-03-28T21:58:24Z

Okay, I got a flame graph for this entire job, but it doesn't look good. I'll look into it.

charleskawczynski · 2024-04-03T23:28:44Z

> Do you think it's useful to add a job like this in the gpu scaling pipeline?

Yes, I think it'd be good. It'd be helpful if we add an nsys report, too.

add a gpu scaling job with diagnostics

25881ba

szy21 marked this pull request as draft March 28, 2024 18:20

szy21 marked this pull request as ready for review March 28, 2024 20:24
