
Partitions and Quality of Service (QoS)

Note

Make sure you have read Specifying resources in Slurm to understand how to use Slurm flags like --partition and --qos.

Executive summary

If you're an ordinary user on M3, you will likely only ever need to choose a partition and QoS when you want to use GPUs. In that case, see our GPUs on M3 page, though you may still want to read this page to understand what these choices actually mean. If you have been granted access to a restricted partition (e.g. you are in FIT or CCEMMP), then you will want to read this page.

Partitions

What is a partition?

A partition is a collection of nodes. Generally, all of the nodes in a given partition share some property, e.g. each node in the gpu partition has GPUs. On M3, all of the partitions are disjoint, i.e. no node belongs to more than one partition.

How do I specify a partition?

You specify a partition with --partition e.g.

sbatch --partition=gpu ...
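The same flag can also be placed inside a batch script as an #SBATCH directive. The following is a minimal sketch, not an official M3 template: the job name, task count, and time limit are illustrative assumptions, and m3i is used only because it is one of the CPU-only partitions listed on this page.

```shell
#!/bin/bash
# Hypothetical job name, chosen for illustration only.
#SBATCH --job-name=partition-demo
# Request the m3i partition instead of the default.
#SBATCH --partition=m3i
#SBATCH --ntasks=1
#SBATCH --time=00:05:00

# Slurm sets SLURM_JOB_PARTITION inside a running job. The fallback
# text lets the script also run outside Slurm for a quick check.
partition="${SLURM_JOB_PARTITION:-not running under Slurm}"
echo "Partition: ${partition}"
```

Submitting this script with sbatch should report the partition the job actually landed on, which is a quick way to confirm the directive was picked up.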

What is the default partition?

If you don't specify a --partition, your job will default to our comp partition.

Quality of Service (QoS)

What is Quality of Service (QoS)?

In Slurm, a Quality of Service (QoS) is used to apply restrictions on what a single user's jobs can do. Importantly, on M3 some partitions require you to specify a QoS in order to use them.

How do I specify a QoS?

You specify a QoS with --qos. For example, if you have access to the fitq QoS you can specify it with

sbatch --qos=fitq ...
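Because some partitions only accept jobs that carry a matching QoS, the two flags are usually given together. The sketch below assumes you have been granted access to the FIT CPU partition; it pairs --partition=fitc with --qos=fitqc as listed in the restricted partitions table on this page. The job name, task count, and time limit are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical job name, chosen for illustration only.
#SBATCH --job-name=qos-demo
# A restricted partition and its matching QoS must be requested together.
#SBATCH --partition=fitc
#SBATCH --qos=fitqc
#SBATCH --ntasks=1
#SBATCH --time=00:05:00

# Slurm sets SLURM_JOB_QOS inside a running job. The fallback text
# lets the script also run outside Slurm for a quick check.
qos="${SLURM_JOB_QOS:-not running under Slurm}"
echo "QoS: ${qos}"
```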

What QoS can I use?

The mon_qos command shows you which QoS you are allowed to use.

Available partitions on M3

The most up-to-date way of seeing all partitions on M3 is to use the show_cluster command.
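If you want a comparable overview using only standard Slurm tooling, sinfo can print similar information. This is a sketch under the assumption that sinfo is available on the login node; the function name list_partitions is hypothetical, and the output will not match the layout of show_cluster. Note that sinfo reports memory in megabytes, whereas the tables on this page use gigabytes.

```shell
# Prints one line per group of identically configured nodes in each
# partition: partition name, node count, CPUs per node, memory per
# node (MB), and generic resources such as GPUs.
list_partitions() {
    sinfo --format="%P %D %c %m %G"
}

# Example usage on a login node:
# list_partitions
```

In sinfo output, the default partition is marked with a trailing asterisk.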

CPU-only partitions

Name | How to use? | Total nodes | Total cores | CPUs per node | Memory per node (GB)
General Computation | --partition=comp (default) | 79 | 1864 | Up to 96 | Up to 1532
High-Density CPUs | --partition=m3i | 45 | 810 | 18 | 181
High-Density CPUs with High Memory | --partition=m3j | 11 | 198 | 18 | 373
High-Density CPUs with Extra High Memory | --partition=m3m | 1 | 18 | 18 | 948
Short Jobs | --partition=short | 2 | 36 | 18 | 181

GPU partitions

GPU type | How to use? | Total nodes | Total cores | CPUs per node | Memory per node (GB) | Total GPUs | GPUs per node
A100, T4, A40 | --partition=gpu | 20 | 552 | Up to 28 | Up to 1020 | 52 | Up to 8
H100 | --partition=m3h --qos=m3h | 2 | 144 | 72 | 1010 | 8 | 4
V100 | --partition=m3g | 19 | 342 | 18 | Up to 373 | 56 | Up to 3

Desktops

Desktop nodes have their own partition, but they are reserved for use by Strudel only. Please see our Strudel docs on running remote desktops. As of December 2024, desktop nodes have a mix of P4, T4, and A40 GPUs.

Restricted partitions

Some partitions on M3 are restricted to certain groups of users. You will generally already know if one of these is relevant to you because your supervisor or colleagues will have told you so.

Partition description | Who can access it? | How to use?
Partition for standard jobs with four hour wall-time for omics community | Genomics community members | --partition=genomics --qos=genomics
Partition with high-RAM nodes for omics community | Genomics community members | --partition=genomicsb --qos=genomicsbq
Intended for real-time processing of data collected from instruments | ... | --partition=rtqp --qos=rtq
Dedicated partition for Patrick Sexton's lab | Members of the Sexton lab | --partition=sexton --qos=sexton01
Partition dedicated to CCEMMP | Members of CCEMMP | --partition=ccemmp --qos=ccemmp
Dedicated to Hudson Institute of Medical Research | Members of Hudson | --partition=hudson --qos=hudson
FIT dedicated GPU nodes | Members of Faculty of IT (FIT) | --partition=fit --qos=fitq
FIT dedicated CPU nodes | Members of Faculty of IT (FIT) | --partition=fitc --qos=fitqc
BDI dedicated nodes | Members of Biomedicine Discovery Institute (BDI) | --partition=bdi --qos=bdiq

Troubleshooting

Invalid qos specification

When you submit a job, if you see the error Invalid qos specification, it means either:

  • The QoS you specified does not actually exist. Run the command below to check if the QoS exists, replacing some-qos with the QoS name:
NAME=some-qos; test "$(sacctmgr show qos $NAME | wc -l)" -eq 2 && echo "QoS does NOT exist" || echo "QoS exists"
  • The QoS you specified does exist, but you don't have access to it. Run mon_qos to see which QoS you are allowed to use. The restricted partitions table above shows whether you're eligible to apply for access to the QoS.
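The existence check above relies on counting header lines in the sacctmgr output. A slightly more explicit variant is sketched below as a shell function. It assumes the standard sacctmgr options --noheader and --parsable2, which suppress the header and column padding so that an empty result means no QoS matched. The function name qos_exists is hypothetical, and some-qos is a placeholder.

```shell
# Succeeds (exit status 0) if the named QoS is defined, fails otherwise.
qos_exists() {
    local name="$1"
    local found
    # With the header and padding removed, the output is empty when
    # no QoS with this name exists.
    found="$(sacctmgr --noheader --parsable2 show qos where name="$name" format=name)"
    [ -n "$found" ]
}

# Example usage (some-qos is a placeholder):
# if qos_exists some-qos; then echo "QoS exists"; else echo "QoS does NOT exist"; fi
```

This only tells you whether the QoS is defined on the cluster; whether you are allowed to use it is still answered by mon_qos.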