slurm

Definition

SLURM originally stood for "Simple Linux Utility for Resource Management". It is an open-source job scheduler and resource manager for high-performance computing clusters.

Commands

Quick Start

It's sometimes easiest to get started by trying out some commands.

Which jobs are currently queued or running?

squeue

Which of my jobs are currently queued or running?

squeue -u username

Launch an interactive session on one node with 16 cores:

srun -N 1 -n 1 -c 16 --pty bash

Launch a batch job on one node with 16 cores:

sbatch -N 1 -n 16 script.sh

Cancel a batch job:

scancel jobID

Cancel all my jobs:

scancel -u username
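
When scripting around job submission, it can help to capture the job ID so that later status and cancel commands can refer to it. By default, sbatch confirms a submission with a line of the form "Submitted batch job <id>". The sketch below assumes that default output; the helper name parse_job_id is made up for illustration and is not part of SLURM.

```shell
#!/bin/bash
# parse_job_id (hypothetical helper): read sbatch's confirmation text on
# stdin and print the numeric job ID from the first matching line.
parse_job_id() {
    awk '/^Submitted batch job [0-9]+/ { print $4; exit }'
}

# Intended use on a cluster (not executed here):
#   job_id=$(sbatch script.sh | parse_job_id)
#   squeue -j "$job_id"     # check on that one job
#   scancel "$job_id"       # cancel it if needed
```

If the input contains no confirmation line (for example, because the submission failed), the helper prints nothing, so the caller can test for an empty result before going further.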

Examples

Here is a more detailed example.

Batch Scripts

Example batch file with directives that reserve one node in the default partition (SLURM's term for a queue), with 16 cores, a one-hour time limit, and exclusive use of the node:

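#!/bin/bash
# Request one node
#SBATCH -N 1
# Run 16 tasks (one core per task by default)
#SBATCH -n 16
# Wall-time limit of one hour (hours:minutes:seconds)
#SBATCH --time=1:00:00
# Do not share the node with other jobs
#SBATCH --exclusive
# <shell commands that set up and run the job go here>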
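
For jobs that differ only in their resource requests, one option is to generate the batch file from a small shell function rather than editing it by hand. The following is a minimal sketch under that assumption: make_batch_script and ./my_program are hypothetical names, and the function only prints text; it does not submit anything.

```shell
#!/bin/bash
# make_batch_script NODES TASKS TIME COMMAND...
# Hypothetical helper: print a batch script with the given resource
# requests to stdout. Nothing is submitted.
make_batch_script() {
    local nodes=$1 tasks=$2 walltime=$3
    shift 3
    printf '#!/bin/bash\n'
    printf '#SBATCH -N %s\n' "$nodes"
    printf '#SBATCH -n %s\n' "$tasks"
    printf '#SBATCH --time=%s\n' "$walltime"
    printf '#SBATCH --exclusive\n'
    # The remaining arguments become the job's command line.
    printf '%s\n' "$*"
}

# Intended use on a cluster (not executed here):
#   make_batch_script 1 16 1:00:00 srun ./my_program > job.sh
#   sbatch job.sh
```

Writing the script to a file before submitting keeps a record of exactly what was requested for each job.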

Tools

The following tools are useful for interacting with SLURM or otherwise working with it.

  • JobMaker is a small interface that a center can deploy, customized to its slurm.conf. Because slurm.conf is readable on all nodes, a user can also generate the data for the tool themselves.
  • JobStats makes it easy for users to see the status of their jobs and which of the requested resources were actually used.
  • doppler is a web application that complements JobStats by showing job efficiency and resource wastage by user and by account.
  • smanage is a tool developed at Harvard to help with managing job arrays.
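
To make "resources actually used versus requested" concrete, one common measure is CPU efficiency: the CPU time a job consumed divided by the CPU time it reserved (elapsed wall time multiplied by allocated cores). The sketch below is an illustration of that arithmetic only; it is not how JobStats or doppler compute their figures, and cpu_efficiency is a hypothetical helper name. It assumes all inputs are already in seconds; sacct can report the Elapsed, TotalCPU, and AllocCPUS fields, but its time values would need converting to seconds first.

```shell
#!/bin/bash
# cpu_efficiency CPU_SECONDS_USED ELAPSED_SECONDS ALLOCATED_CPUS
# Hypothetical helper: print the CPU time used as a percentage of the
# CPU time reserved (elapsed wall time x allocated cores).
cpu_efficiency() {
    awk -v used="$1" -v elapsed="$2" -v cpus="$3" 'BEGIN {
        reserved = elapsed * cpus
        # Avoid dividing by zero for jobs that never started.
        if (reserved <= 0) { print "n/a"; exit }
        printf "%.1f\n", 100 * used / reserved
    }'
}

# A job that used 28800 CPU-seconds over 3600 s of wall time on 16 cores:
#   cpu_efficiency 28800 3600 16    # prints 50.0
```

A value well below 100 suggests that the job requested more cores than it kept busy.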
