Cluster usage

The cluster is managed by the SLURM job scheduler. Users submit jobs to a queue, and SLURM runs them on the cluster's compute nodes. SLURM decides when and where each job runs based on fair usage, the resources requested, the resources available, and queue and account priorities.

Submitting jobs

To submit a job to the cluster, pass a batch script (here myjob.slr) to the sbatch command:

sbatch myjob.slr
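
As a minimal sketch, a batch script such as myjob.slr might look like the following. Its contents are an assumption, since the actual script is not shown here: the job name, resource values, and output file name are placeholders, and the partition name short is borrowed from the example sinfo output in the partition information section. Lines starting with #SBATCH are read by sbatch as options and ignored by the shell.

```shell
#!/bin/bash
# Hypothetical example script; every value below is a placeholder to adjust.
#SBATCH --job-name=myjob          # name shown in squeue
#SBATCH --partition=short         # partition (queue) to submit to
#SBATCH --ntasks=1                # number of tasks (processes)
#SBATCH --cpus-per-task=1         # CPU cores per task
#SBATCH --mem=1G                  # memory per node
#SBATCH --time=00:10:00           # wall-clock limit (HH:MM:SS)
#SBATCH --output=myjob-%j.out     # output file; %j expands to the job ID

# SLURM sets SLURM_JOB_ID inside a running job; fall back to "none"
# so the script can also be tried outside the scheduler.
job_id="${SLURM_JOB_ID:-none}"
echo "Job ${job_id} running on $(uname -n)"
```

Keeping the resource requests in the script rather than on the sbatch command line means the same file documents and reproduces the job.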

Monitoring jobs

After submitting jobs to the cluster, you can monitor them using:

squeue

This command lists all jobs on the cluster, showing each job's ID, the partition it was submitted to, the job name, and the user. It also shows each job's state and how long it has been running. For jobs still waiting in the queue, it may also show the reason they have not started.
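
On a busy cluster the full list can be long. The sketch below restricts the listing to your own jobs and selects the fields described above. The format string is only one possible choice, and the fallback message is an addition for running the snippet on a machine without SLURM installed.

```shell
#!/bin/bash
# List only the current user's jobs with a custom column layout.
user="${USER:-$(id -un)}"

# %i job ID, %P partition, %j job name, %T state,
# %M time used, %R reason (pending jobs) or node list (running jobs)
fmt='%.10i %.9P %.20j %.8T %.10M %R'

if command -v squeue >/dev/null 2>&1; then
    squeue -u "$user" -o "$fmt"
else
    echo "squeue not found; run this on a cluster login node" >&2
fi
```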

Partition information

To obtain information about the partitions on the cluster, use the command:

sinfo

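For example, the output might look similar to the following, where the asterisk marks the default partition (short) and a node such as dragon can belong to several partitions: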

$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
test         up      30:00      2   idle hurricane,spitfire
short*       up    6:00:00      1  drain comet
short*       up    6:00:00      1    mix dragon
short*       up    6:00:00      4   idle fury,gauntlet,hornet,mohawk
medium       up 1-00:00:00      1    mix dragon
medium       up 1-00:00:00      3   idle fury,gauntlet,mohawk
long         up 14-00:00:0      1    mix dragon
long         up 14-00:00:0      1   idle gauntlet
gpu          up 1-00:00:00      2   idle meteor,typhoon
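
To decide where to submit, it can help to reduce this listing to the partitions that currently have idle nodes. The following is a rough sketch that assumes sinfo's default six-column layout as shown above; the function name idle_by_partition is hypothetical, and the sample text is simply the example output repeated so the snippet can be tried without access to the cluster.

```shell
#!/bin/bash
# Print "partition idle_node_count" for each row whose state is idle.
# Expects sinfo's default output (with header line) on standard input.
idle_by_partition() {
    awk 'NR > 1 && $5 == "idle" {
        sub(/\*$/, "", $1)   # drop the default-partition marker
        print $1, $4
    }'
}

# Sample input copied from the example output above.
sample='PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
test         up      30:00      2   idle hurricane,spitfire
short*       up    6:00:00      1  drain comet
short*       up    6:00:00      1    mix dragon
short*       up    6:00:00      4   idle fury,gauntlet,hornet,mohawk
medium       up 1-00:00:00      1    mix dragon
medium       up 1-00:00:00      3   idle fury,gauntlet,mohawk
long         up 14-00:00:0      1    mix dragon
long         up 14-00:00:0      1   idle gauntlet
gpu          up 1-00:00:00      2   idle meteor,typhoon'

result="$(printf '%s\n' "$sample" | idle_by_partition)"
printf '%s\n' "$result"
```

On the cluster itself the function would be fed live data, for example sinfo | idle_by_partition. sinfo can also filter by state directly with sinfo -t idle.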