
LLNL GitLab-CI #747

Open
jedbrown opened this issue Apr 17, 2021 · 12 comments

@jedbrown
Member

For those with LLNL CZ access, I created a mirror repository that will be able to run CI jobs. You can log in and request access if you don't already have it.

Relevant documentation:

The LLNL system is set up to allow "LGTM" comments by a trusted member to launch a pipeline with jobs on LLNL machines (including quartz, lassen, and corona). Some MFEM team members have experience with this setup. I don't think we need to use it for all PRs, but it'd be wonderful to set this up so it's easy to use for PRs that look like they would benefit. I'd envision using the batch executor on a single node, ideally with either MFEM or PETSc to test multi-GPU solvers. This could include short-running performance tests for longitudinal tracking.
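
As a rough sketch only (the job name, runner tags, and scheduler flags below are assumptions and would need to match however the LLNL runners are actually registered), a single-node batch-executor job on the LLNL instance might look like:

llnl-lassen-batch:
  stage: test
  tags:
    - lassen
    - batch
  variables:
    # Hypothetical scheduler options: one node, 30-minute walltime
    SCHEDULER_PARAMETERS: "-nnodes 1 -W 30"
  script:
    - make -j$(nproc)
    - make junit search="petsc fluids solids"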

@adrienbernede

@jedbrown For your information: https://radiuss-ci.readthedocs.io/en/latest/
This documents a method to implement CI for a mirrored repo on LC, using Uberenv to automate the installation of dependencies through Spack.
Whatever workflow you choose, I can definitely help with the setup.

@v-dobrev
Member

The libCEED repo already has a .gitlab-ci.yml with a GitLab CI setup for use outside the LLNL GitLab instance. Is there a way to distinguish what gets run on which GitLab instance?

@adrienbernede

adrienbernede commented Apr 19, 2021

Yes, there is. We can create "rules" that distinguish jobs based on the server name, for example via CI_SERVER_HOST.
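
In a job that should only run on the LLNL instance, for example, that could look like this minimal sketch:

  rules:
    - if: '$CI_SERVER_HOST == "gitlab.llnl.gov"'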

@jedbrown
Member Author

I think I would prefer the first of these two options because all content stays in the repository.
https://ecp-ci.gitlab.io/docs/guides/multi-gitlab-project.html

@adrienbernede

This ECP documentation is really good, and it will save me a considerable amount of time answering the same questions here from the lab-internal docs!

@jeremylt
Member

I like the first of those two options as well. For readability, though, I could see it being useful to split the scripts for each CI instance into separate files that get referenced in .gitlab-ci.yml.

@adrienbernede

adrienbernede commented Apr 19, 2021

@jeremylt, I am failing to see the nuance with the example given, where each job already calls a different script.
Or maybe you meant that the YAML files should be split by instance (using the include keyword)?
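
For instance (hypothetical file names), splitting by instance with include could look like:

include:
  - local: '.gitlab/noether-ci.yml'
  - local: '.gitlab/llnl-ci.yml'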

@jeremylt
Member

I'm just commenting that I would find it easier to see what's going on if all of this from our current YAML

# Compilers
    - export COVERAGE=1 CC=gcc CXX=g++ FC=gfortran HIPCC=hipcc
    - echo "-------------- nproc ---------------" && NPROC_CPU=$(nproc) && NPROC_GPU=$(($(nproc)<8?$(nproc):8))
    - echo "-------------- CC ------------------" && $CC --version
    - echo "-------------- CXX -----------------" && $CXX --version
    - echo "-------------- FC ------------------" && $FC --version
    - echo "-------------- HIPCC ---------------" && $HIPCC --version
    - echo "-------------- GCOV ----------------" && gcov --version
# Libraries for backends
# -- MAGMA from dev branch
    - echo "-------------- MAGMA ---------------"
    - export MAGMA_DIR=/projects/hipMAGMA && git -C $MAGMA_DIR describe
# -- LIBXSMM v1.16.1
    - cd .. && export XSMM_VERSION=libxsmm-1.16.1 && { [[ -d $XSMM_VERSION ]] || { git clone --depth 1 --branch 1.16.1 https://github.com/hfp/libxsmm.git $XSMM_VERSION && make -C $XSMM_VERSION -j$(nproc); }; } && export XSMM_DIR=$PWD/$XSMM_VERSION && cd libCEED
    - echo "-------------- LIBXSMM -------------" && git -C $XSMM_DIR describe --tags
# -- OCCA v1.1.0
    - cd .. && export OCCA_VERSION=occa-1.1.0 OCCA_OPENCL_ENABLED=0 && { [[ -d $OCCA_VERSION ]] || { git clone --depth 1 --branch v1.1.0 https://github.com/libocca/occa.git $OCCA_VERSION && make -C $OCCA_VERSION -j$(nproc); }; } && export OCCA_DIR=$PWD/$OCCA_VERSION && cd libCEED
    - echo "-------------- OCCA ----------------" && make -C $OCCA_DIR info
# libCEED
    - make configure HIP_DIR=/opt/rocm OPT='-O -march=native -ffp-contract=fast'
    - BACKENDS_CPU=$(make info-backends | grep -o '/cpu[^ ]*') && BACKENDS_GPU=$(make info-backends | grep -o '/gpu[^ ]*')
    - echo "-------------- libCEED -------------" && make info
    - make -j$NPROC_CPU
# -- libCEED only tests
    - echo "-------------- core tests ----------"
    - echo '[{"subject":"/","metrics":[{"name":"Transfer Size (KB)","value":"19.5","desiredSize":"smaller"},{"name":"Speed Index","value":0,"desiredSize":"smaller"},{"name":"Total Score","value":92,"desiredSize":"larger"},{"name":"Requests","value":4,"desiredSize":"smaller"}]}]' > performance.json
    - make -k -j$NPROC_CPU BACKENDS="$BACKENDS_CPU" junit realsearch=%
    - make -k -j$NPROC_GPU BACKENDS="$BACKENDS_GPU" junit realsearch=%
# Libraries for examples
# -- PETSc with HIP (minimal)
#    Note: These tests don't run in ' make junit realsearch=%' until PETSC_DIR set
#          PETSC_DIR is not set by default in GitLab runner env
    - export PETSC_DIR=/projects/petsc PETSC_ARCH=mpich-hip && git -C $PETSC_DIR describe
    - echo "-------------- PETSc ---------------" && make -C $PETSC_DIR info
    - make -k -j$NPROC_CPU BACKENDS="$BACKENDS_CPU" junit search="petsc fluids solids"
    - make -k -j$NPROC_GPU BACKENDS="$BACKENDS_GPU" junit search="petsc fluids solids"
# -- MFEM v4.2
    - cd .. && export MFEM_VERSION=mfem-4.2 && { [[ -d $MFEM_VERSION ]] || { git clone --depth 1 --branch v4.2 https://github.com/mfem/mfem.git $MFEM_VERSION && make -C $MFEM_VERSION -j$(nproc) serial CXXFLAGS="-O -std=c++11"; }; } && export MFEM_DIR=$PWD/$MFEM_VERSION && cd libCEED
    - echo "-------------- MFEM ----------------" && make -C $MFEM_DIR info
    - make -k -j$NPROC_CPU BACKENDS="$BACKENDS_CPU" junit search=mfem
    - make -k -j$NPROC_GPU BACKENDS="$BACKENDS_GPU" junit search=mfem
# -- Nek5000 v19.0
    - export COVERAGE=0
    - cd .. && export NEK5K_VERSION=Nek5000-19.0 && { [[ -d $NEK5K_VERSION ]] || { git clone --depth 1 --branch v19.0 https://github.com/Nek5000/Nek5000.git $NEK5K_VERSION && cd $NEK5K_VERSION/tools && ./maketools genbox genmap reatore2 && cd ../..; }; } && export NEK5K_DIR=$PWD/$NEK5K_VERSION && export PATH=$NEK5K_DIR/bin:$PATH MPI=0 && cd libCEED
    - echo "-------------- Nek5000 -------------" && git -C $NEK5K_DIR describe --tags
    - make -k -j$NPROC_CPU BACKENDS="$BACKENDS_CPU" junit search=nek
    - make -k -j$NPROC_GPU BACKENDS="$BACKENDS_GPU" junit search=nek
# Clang-tidy
    - echo "-------------- clang-tidy ----------" && clang-tidy --version
    - TIDY_OPTS="-fix-errors" make -j$NPROC_CPU tidy && git diff --exit-code
# Report status
    - echo "SUCCESS" > .job_status
  after_script:
    - |
      if [ $(cat .job_status) == "SUCCESS" ]; then
        lcov --directory . --capture --output-file coverage.info;
        bash <(curl -s https://codecov.io/bash) -f coverage.info -t ${CODECOV_ACCESS_TOKEN} -F interface;
        bash <(curl -s https://codecov.io/bash) -f coverage.info -t ${CODECOV_ACCESS_TOKEN} -F gallery;
        bash <(curl -s https://codecov.io/bash) -f coverage.info -t ${CODECOV_ACCESS_TOKEN} -F backends;
        bash <(curl -s https://codecov.io/bash) -f coverage.info -t ${CODECOV_ACCESS_TOKEN} -F tests;
        bash <(curl -s https://codecov.io/bash) -f coverage.info -t ${CODECOV_ACCESS_TOKEN} -F examples;
      fi

would be put in a separate bash file (and the script for the new LLNL job put into its own bash file) if we run both CI jobs off of the same yaml.

Then we'd have

.rules-noether:
  rules:
  - if: '$CI_SERVER_HOST == "noether.colorado.edu"'
    when: always

.rules-llnl:
  rules:
  - if: '$CI_SERVER_HOST == "gitlab.llnl.gov"'
    when: always

noether-rocm:
  stage: test
  tags:
    - rocm
  image: jedbrown/rocm:latest
  extends: .rules-noether
  script:
    - ./bash-file-with-noether-script

llnl-cuda:
  stage: test
  tags:
    - cuda
  extends: .rules-llnl
  script:
    - ./bash-file-with-llnl-script

or something along those lines

@adrienbernede

FYI, the first method has the drawback that rules can be hard to stack.
So I would not write them the way this documentation suggests:

  - if: '$CI_SERVER_HOST == "gitlab.anl.gov"'
    when: always

There are 2 inaccuracies here:

  • The default behavior is "on_success", not "always". Using always would make the job run all the time, ignoring any failure in previous stages. This is not a good default behavior, so it shouldn't be used in a generic example.
  • In order to make rules "stackable" it is better to say the following:
  - if: '$CI_SERVER_HOST != "gitlab.anl.gov"'
    when: never

That is because, per the documentation:

Rules are evaluated in order until a match is found.
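
So a stackable version of the noether rules from the sketch above might look like this (reusing the placeholder names from that sketch):

.rules-noether:
  rules:
    # Skip this job entirely on any other GitLab instance...
    - if: '$CI_SERVER_HOST != "noether.colorado.edu"'
      when: never
    # ...otherwise keep the default on_success behavior.
    - when: on_success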

@adrienbernede

I don't see a link to the ECP CI documentation repo, but I would like to share those thoughts with the authors.

@adrienbernede

adrienbernede commented Apr 19, 2021

@jeremylt I agree, putting scripts in a separate file is a good practice.

@jedbrown
Member Author
