
Add miniapps as tests #1112

Draft: msimberg wants to merge 18 commits into master from miniapps-as-tests

Conversation

@msimberg (Collaborator) commented Mar 19, 2024:

Adds the algorithms miniapps as tests. miniapp_laset is not added, since it does not test anything particularly useful and it interferes with the check-threads setup (which verifies that no unexpected OS threads are spawned). The miniapps are run with 6 ranks, --grid-rows 3 --grid-cols 2, and otherwise default options.

This splits DLAF_addTest into two separate functions: DLAF_addTargetTest, which adds a test for an existing target and is used for the miniapps since they already have CMake targets, and DLAF_addTest, which creates an executable and then calls DLAF_addTargetTest. The behaviour of DLAF_addTest remains unchanged.

This also adds a CATEGORY parameter that is attached to tests as a label, alongside the RANK_* labels. CATEGORY defaults to UNIT for unit tests, and I set it to MINIAPP for the miniapps. I'm fine with making the choice explicit and/or renaming or adding categories. The RANK and CATEGORY labels are then used to generate CI jobs in the same way RANK is currently used on master; combinations that have no tests are not added as CI jobs.

Finally, many miniapps were hanging on the CUDA configurations, where they run with only one worker thread and one MPI thread. I've changed the waitLocalTiles calls to pika::wait calls before MPI_Barrier to make sure that nothing is still running and deadlocks are avoided (see the sketch below). I've only made this change after the algorithm calls, but for consistency it might make sense to use pika::wait before every MPI_Barrier call?
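
For illustration, a minimal sketch of the pattern described above, not code from this PR: `run_one_repetition` is a placeholder name, the pika header location is an assumption, and MPI_COMM_WORLD stands in for the grid's communicator used by the actual miniapps.

```cpp
#include <mpi.h>

#include <pika/runtime.hpp>  // assumed header providing pika::wait()

// Sketch of one timed repetition in a miniapp (placeholder names).
void run_one_repetition(/* matrix, communicator grid, ... */) {
  // ... asynchronously launch the algorithm under test ...

  // Before this PR: matrix.waitLocalTiles();  // waits only for this matrix's tiles
  // After this PR: wait until pika has no work left at all, so no task can
  // still be using a communicator when the barrier below is reached.
  pika::wait();

  MPI_Barrier(MPI_COMM_WORLD);  // simplified; the miniapps barrier on the grid's communicator
}
```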

msimberg self-assigned this Mar 19, 2024
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run

msimberg force-pushed the miniapps-as-tests branch 2 times, most recently from 7041343 to ed29663 on Mar 21, 2024 09:26
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author) commented Apr 8, 2024: cscs-ci run

msimberg force-pushed the miniapps-as-tests branch 2 times, most recently from 6c39878 to 62b54e9 on Apr 9, 2024 08:02
@msimberg (Collaborator, Author) commented Apr 9, 2024: cscs-ci run
@msimberg (Collaborator, Author) commented Apr 9, 2024: cscs-ci run
@msimberg (Collaborator, Author) commented Apr 9, 2024: cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run

@rasolca (Collaborator) left a comment:

Not happy about spreading pika::waits everywhere in the miniapps.
I'd prefer another solution.
Would scheduling and sync_waiting an Ibarrier work?

@msimberg (Collaborator, Author) commented May 3, 2024:

> Not happy about spreading pika::waits everywhere in the miniapps. I'd prefer another solution. Would scheduling and sync_waiting an Ibarrier work?

I don't know, but I can try. It would need IBarrier-ing all the communicators in that case. Edit: or would it?

msimberg marked this pull request as draft on May 13, 2024 13:03
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run
@msimberg (Collaborator, Author): cscs-ci run

Comment on lines +163 to +167
for (std::size_t i = 0; i < comm_grid.num_pipelines(); ++i) {
  sync_wait(comm_grid.full_communicator_pipeline().exclusive());
  sync_wait(comm_grid.row_communicator_pipeline().exclusive());
  sync_wait(comm_grid.col_communicator_pipeline().exclusive());
}

@msimberg (Collaborator, Author) commented:

Waiting for communicators does not seem to be strictly required in all the miniapps, but I would add it to all of them in any case. Whether or not it's required probably depends on the internal communication patterns.

In any case, the above snippet seems to work fine in CI now. I think we could either:

  • add a helper function available only for tests/miniapps with the signature void wait_all_communicators(CommunicatorGrid&) which does the above, or
  • add a member function, similar to Matrix::waitLocalTiles, to CommunicatorGrid which does the above

Do you have preferences? Other ideas? I think the latter is nice in terms of encapsulation, but I'm not sure we want to encourage users to use it (yet...?).
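
For reference, a minimal sketch of the first option, assuming the helper lives in a test/miniapp-only header; wait_all_communicators is a hypothetical name and the header paths are assumptions, while the calls are the same ones as in the snippet above.

```cpp
#include <cstddef>

#include <pika/execution.hpp>  // assumed header for pika::this_thread::experimental::sync_wait

#include <dlaf/communication/communicator_grid.h>  // assumed header for CommunicatorGrid

// Hypothetical test/miniapp-only helper: acquire exclusive access to every
// communicator pipeline in the grid, which only succeeds once all previously
// scheduled communication on those pipelines has completed.
inline void wait_all_communicators(dlaf::comm::CommunicatorGrid& comm_grid) {
  namespace tt = pika::this_thread::experimental;
  for (std::size_t i = 0; i < comm_grid.num_pipelines(); ++i) {
    tt::sync_wait(comm_grid.full_communicator_pipeline().exclusive());
    tt::sync_wait(comm_grid.row_communicator_pipeline().exclusive());
    tt::sync_wait(comm_grid.col_communicator_pipeline().exclusive());
  }
}
```

The second option would move the same loop into CommunicatorGrid itself, so callers would not need to know about the individual pipelines.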
