Verolero86 ace #761

Open: wants to merge 6 commits into base: verolero86-ace

todbadrakh
Member

Added a Defiant quick-start guide to the ACE Testbed, with the associated link changes in index.rst:

  • new node diagram
  • corrected AMD compiler paths and modules
  • fixed the AMD OpenMP offload flag example (--target)

Need info on:

  • file system and directory structure
  • Slurm queues (batch-cpu and batch-gpu) and their policies

Contributor

@secondspass secondspass left a comment

Hey Togo, this is Subil. Thank you for working on the docs. I've left a few comments to check after going through it. Let me know if I can be of any help.


.. note::

Setting ``MPICH_SMP_SINGLE_COPY_MODE=CMA`` is required as a temporary
Contributor

This references a known issue that seems to be resolved. It might have been something that was present on Spock. We should verify it's still an issue.

Member Author

Issue is resolved and we are at MPICH version 8.1.27. Removing this.

module load amd

## These must be set before running
export MPIR_CVAR_GPU_EAGER_DEVICE_MEM=0
Contributor

Are these environment variables still necessary to use GPU-aware MPI? I'm not 100% sure that GPU-aware MPI even works correctly on Defiant yet. It didn't seem to when I tried, but I might be doing it wrong. That's something to verify.

Member Author

Leaving for now until confirmed.
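
While this is being confirmed, it may help to record which of these variables are actually exported in the job environment for each test run. A minimal sketch: the helper name `show_vars` is hypothetical, and the only variable name taken from the diff above is `MPIR_CVAR_GPU_EAGER_DEVICE_MEM`. It reports what is set, not whether GPU-aware MPI works.

```shell
# Print each named variable as NAME=value, or NAME=<unset> when it is
# not present in the exported environment (hypothetical helper).
show_vars() {
    for name in "$@"; do
        # printenv exits non-zero when the variable is not exported
        if value=$(printenv "$name"); then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s=<unset>\n' "$name"
        fi
    done
}
```

Calling `show_vars MPIR_CVAR_GPU_EAGER_DEVICE_MEM` at the top of a batch script, before the launch command, would put the setting in the job log next to the test result.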

@@ -46,6 +46,7 @@ Systems
:maxdepth: 2

systems/index
ace_testbed/index
Contributor

This one and the index item on line 106 below are causing the link to the ACE Testbed doc to appear twice in the sidebar. There should only be one (though I don't know which one would be better). My vote is to keep the one on line 106 and delete this one.

Member Author

Forgot to delete this one at line 49 after adding line 106. Removing line 49.
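
A quick way to check for a duplicate entry like this before and after the fix is to count how often it appears in the index file. A minimal sketch, assuming the file is `index.rst` in the current directory; the helper name `count_toctree_entry` is illustrative and not part of the PR.

```shell
# Count lines in a file that consist solely of the given toctree entry,
# ignoring leading and trailing whitespace (hypothetical helper).
# Usage: count_toctree_entry <entry> <file>
count_toctree_entry() {
    entry="$1"
    file="$2"
    # sed strips surrounding whitespace; grep -x matches whole lines,
    # -F treats the entry as a fixed string, -c prints the match count.
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' "$file" | grep -cxF -- "$entry"
}
```

Running `count_toctree_entry ace_testbed/index index.rst` prints the number of occurrences. Anything greater than 1 suggests the entry is still listed in more than one place.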
