can you share the slurm.conf you are using? #37

Open
OhadRubin opened this issue Mar 27, 2022 · 3 comments

@OhadRubin

Hey,
pinging @stas00
I'm a researcher at Tel-Aviv University, and we are thinking about implementing QoS, similar to what you have on the Jean Zay cluster.
It would be really helpful to see the slurm.conf you are using for your QOS setting.
Thanks!
Ohad

RemiLacroix-IDRIS commented Mar 28, 2022

Hi @OhadRubin,

The QoS settings are not defined in slurm.conf. What would you like to know exactly?

Rémi
IDRIS User Support Team

OhadRubin commented Mar 29, 2022

I would like to reproduce the QOS settings you have here:

--qos=qos_gpu-t3 20h / 512gpus (default priority)
--qos=qos_gpu-t4 100h / 16gpus - long-running slow jobs - e.g. preprocessing
--qos=qos_gpu-dev 2h / 32gpus - this is for getting allocation much faster - for dev work!

(with slightly smaller numbers haha)

@RemiLacroix-IDRIS

The output of sacctmgr show qos -P:

Name|Priority|GraceTime|Preempt|PreemptExemptTime|PreemptMode|Flags|UsageThres|UsageFactor|GrpTRES|GrpTRESMins|GrpTRESRunMins|GrpJobs|GrpSubmit|GrpWall|MaxTRES|MaxTRESPerNode|MaxTRESMins|MaxWall|MaxTRESPU|MaxJobsPU|MaxSubmitPU|MaxTRESPA|MaxJobsPA|MaxSubmitPA|MinTRES
qos_cpu-dev|80|00:00:00|||cluster|||1.000000|cpu=96000|||||||||02:00:00|cpu=10240||10|cpu=10240|||
qos_gpu-dev|80|00:00:00|||cluster|||1.000000|cpu=10240,gres/gpu=512|||||||||02:00:00|cpu=640,gres/gpu=32||10|cpu=640,gres/gpu=32|||
qos_cpu-t3|50|00:00:00|||cluster|||1.000000|||||||cpu=40960|||20:00:00|cpu=96000||10000|cpu=96000|||
qos_gpu-t3|50|00:00:00|||cluster|||1.000000|||||||cpu=10240,gres/gpu=512|||20:00:00|cpu=20480,gres/gpu=1024||10000|cpu=20480,gres/gpu=1024|||
qos_cpu-t4|40|00:00:00|||cluster|||1.000000|cpu=10240||||||cpu=320|||4-04:00:00|cpu=2560|||cpu=2560|||
qos_gpu-t4|40|00:00:00|||cluster|||1.000000|cpu=10240,gres/gpu=512||||||cpu=3600,gres/gpu=180|||4-04:00:00|cpu=3600,gres/gpu=180|||cpu=3600,gres/gpu=180|||
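
For anyone trying to read the limits out of this output, here is a minimal Python sketch that parses the pipe-delimited text into one dictionary per QoS and converts MaxWall to hours. The helper names `parse_qos` and `wall_hours` are made up for this illustration and are not part of Slurm; the sample text is the header plus the three GPU rows copied from above. As far as I understand the sacctmgr column names, MaxTRES is the per-job limit, MaxTRESPU the per-user limit, and MaxTRESPA the per-account limit, but please check that against the documentation for your Slurm version.

```python
import csv
import io

# Header and the three GPU rows, copied verbatim from the output above.
SACCTMGR_OUTPUT = """\
Name|Priority|GraceTime|Preempt|PreemptExemptTime|PreemptMode|Flags|UsageThres|UsageFactor|GrpTRES|GrpTRESMins|GrpTRESRunMins|GrpJobs|GrpSubmit|GrpWall|MaxTRES|MaxTRESPerNode|MaxTRESMins|MaxWall|MaxTRESPU|MaxJobsPU|MaxSubmitPU|MaxTRESPA|MaxJobsPA|MaxSubmitPA|MinTRES
qos_gpu-dev|80|00:00:00|||cluster|||1.000000|cpu=10240,gres/gpu=512|||||||||02:00:00|cpu=640,gres/gpu=32||10|cpu=640,gres/gpu=32|||
qos_gpu-t3|50|00:00:00|||cluster|||1.000000|||||||cpu=10240,gres/gpu=512|||20:00:00|cpu=20480,gres/gpu=1024||10000|cpu=20480,gres/gpu=1024|||
qos_gpu-t4|40|00:00:00|||cluster|||1.000000|cpu=10240,gres/gpu=512||||||cpu=3600,gres/gpu=180|||4-04:00:00|cpu=3600,gres/gpu=180|||cpu=3600,gres/gpu=180|||
"""


def parse_qos(text):
    """Return {qos_name: {column: value}} from `sacctmgr show qos -P` text."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    return {row["Name"]: row for row in reader}


def wall_hours(max_wall):
    """Convert a Slurm time string of the form [days-]HH:MM:SS to hours."""
    days, _, clock = max_wall.rpartition("-")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return int(days or 0) * 24 + hours + minutes / 60 + seconds / 3600


qos = parse_qos(SACCTMGR_OUTPUT)
for name, row in qos.items():
    # Empty strings mean the limit is not set for that QoS.
    print(
        f"{name}: priority={row['Priority']} "
        f"max_wall={wall_hours(row['MaxWall'])}h "
        f"per_job={row['MaxTRES'] or '-'} "
        f"per_user={row['MaxTRESPU'] or '-'}"
    )
```

Read this way, the dev QoS gets a higher priority (80) with a 2-hour wall limit and a 32-GPU per-user cap, while t3 and t4 trade lower priority for longer wall times.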
