This repository has been archived by the owner on Mar 20, 2023. It is now read-only.
Problem Description
Some environment variables are hard-coded for Intel MPI (on the supported HC and HB VMs). I ran into this issue when running MPI jobs on 2 nodes with Intel MPI 2020, where mlx does not seem to work for FI_PROVIDERS but verbs does. I had to change the value in batch.py to make it work.
Batch Shipyard Version
Latest
Steps to Reproduce
Set FI_PROVIDERS or I_MPI_FABRICS as an environment variable in jobs.yaml
Expected Results
The environment variables are set to the values defined in jobs.yaml
Actual Results
The hard-coded defaults from batch.py (lines 4409 and 4410) are used instead
Additional Comments
I haven't found where to get the user-supplied value, but these lines (and maybe the ones for other versions too) might be better replaced by something like `ib_env[...] = env.get(user) or default` (pseudo-code).
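To make the pseudo-code above concrete, here is a minimal sketch of the "user value or default" pattern. It is not the actual batch.py code: the function name `merge_ib_env`, the `DEFAULT_IB_ENV` table, and its default values are assumptions for illustration only, and the variable names are spelled as in this report.

```python
# Hypothetical defaults: the real values live in batch.py and may differ.
# Variable names are spelled as in this issue report.
DEFAULT_IB_ENV = {
    "I_MPI_FABRICS": "shm:ofi",
    "FI_PROVIDERS": "mlx",
}


def merge_ib_env(user_env, defaults=None):
    """Return the defaults, overridden by any non-empty user-supplied value.

    user_env is a plain dict of environment variables, e.g. the ones a
    user would declare in jobs.yaml (how they are loaded is out of scope).
    """
    if defaults is None:
        defaults = DEFAULT_IB_ENV
    ib_env = {}
    for name, default in defaults.items():
        # `or` falls back to the default when the key is missing or empty.
        ib_env[name] = user_env.get(name) or default
    return ib_env


# Example: the user overrides only the provider.
print(merge_ib_env({"FI_PROVIDERS": "verbs"}))
```

With this approach a job that sets nothing keeps today's behavior, while a job that sets one of the variables gets its own value instead of the hard-coded one.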
Yes, I do have MLNX_OFED installed from Mellanox. I got it to run with some UCX fallback variables, but I think it would be nice to be able to set these environment variables differently rather than having to debug the hard-coded shm:ofi+mlx.