Cluster usage
The cluster is managed by the SLURM job scheduler. Users submit jobs to a queue, and SLURM runs them on the cluster's compute nodes. SLURM decides which job gets which resources based on fair usage, the resources requested, the resources currently available, and queue and account priorities.
Submitting jobs
To submit a job to the cluster, pass your job script to the sbatch command:
sbatch myjob.slr
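The contents of myjob.slr are not shown here. As a minimal sketch, a job script is an ordinary shell script whose #SBATCH comment lines pass options to sbatch. The job name, partition, resource amounts, time limit and output file name below are illustrative assumptions, not site defaults, so adjust them to your own job:

```shell
#!/bin/bash
#SBATCH --job-name=myjob        # name shown by squeue (assumed)
#SBATCH --partition=short       # assumed partition; check sinfo for the real list
#SBATCH --ntasks=1              # one task
#SBATCH --cpus-per-task=1       # one CPU core for that task
#SBATCH --mem=1G                # memory per node (assumed value)
#SBATCH --time=00:10:00         # wall-clock limit, must fit the partition's limit
#SBATCH --output=myjob-%j.out   # %j is replaced by the job ID

# SLURM sets SLURM_JOB_ID inside a running job; fall back to a
# placeholder so the script can also be tried outside the scheduler.
job_id="${SLURM_JOB_ID:-not-in-slurm}"
echo "Job ${job_id} running on $(hostname)"
```

Because the #SBATCH lines are comments to the shell, the script can be run directly with bash as a quick check before it is submitted.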
Monitoring jobs
After submitting jobs to the cluster, you can monitor your jobs using:
squeue
This command lists all jobs on the cluster, showing each job's ID, partition, name and user, along with its state and how long it has been running. For jobs still waiting in the queue, it may also show the reason the job has not started.
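On a busy cluster the full list can be long. A common way to narrow it, sketched here on the assumption that your cluster login name matches your shell user name, is the -u option of squeue, which restricts the listing to one user's jobs:

```shell
# Work out the current user name; USER is normally set, id -un is a fallback.
user="${USER:-$(id -un)}"

# Show only this user's jobs. The check lets the snippet fail gracefully
# on a machine where the SLURM client tools are not installed.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$user"
else
    echo "squeue not found: run this on a cluster login node" >&2
fi
```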
Partition information
To obtain information about the partitions on the cluster, use the command:
sinfo
For example, the output might look similar to:
$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
test         up      30:00      2   idle hurricane,spitfire
short*       up    6:00:00      1  drain comet
short*       up    6:00:00      1    mix dragon
short*       up    6:00:00      4   idle fury,gauntlet,hornet,mohawk
medium       up 1-00:00:00      1    mix dragon
medium       up 1-00:00:00      3   idle fury,gauntlet,mohawk
long         up 14-00:00:0      1    mix dragon
long         up 14-00:00:0      1   idle gauntlet
gpu          up 1-00:00:00      2   idle meteor,typhoon
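To see at a glance where a job could start immediately, the listing can be filtered for idle nodes. The sketch below is one possible approach, not a site-provided tool: it applies awk to a copy of the example output above (stored in the hypothetical variable sinfo_sample) so that it runs anywhere. On the cluster itself you would pipe the output of sinfo into the same awk command instead.

```shell
# Copy of the example sinfo output shown above (not live data).
sinfo_sample='PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
test up 30:00 2 idle hurricane,spitfire
short* up 6:00:00 1 drain comet
short* up 6:00:00 1 mix dragon
short* up 6:00:00 4 idle fury,gauntlet,hornet,mohawk
medium up 1-00:00:00 1 mix dragon
medium up 1-00:00:00 3 idle fury,gauntlet,mohawk
long up 14-00:00:0 1 mix dragon
long up 14-00:00:0 1 idle gauntlet
gpu up 1-00:00:00 2 idle meteor,typhoon'

# Skip the header (NR > 1), keep rows whose STATE column is "idle",
# strip the trailing "*" that marks the default partition, and print
# "partition: nodelist".
idle=$(printf '%s\n' "$sinfo_sample" |
    awk 'NR > 1 && $5 == "idle" { sub(/\*$/, "", $1); print $1 ": " $6 }')

echo "$idle"
```

sinfo can also do this filtering itself: the -t option (long form --states) limits the listing to nodes in the given state, as in sinfo -t idle.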