Partitions in MonARCH
Partitions Available
All nodes currently belong to one big partition. Users discriminate between different hardware types with extra parameters within their submission script.
The default partition for all submitted jobs is:
- comp for compute nodes. The maximum walltime is one week.
Other partitions include:
- short for jobs with a walltime < 1 day.
- gpu for the GPU nodes
Example: To run a job of up to 1 hour on the short partition, put these lines in your SLURM submission script.
#SBATCH --partition=short
#SBATCH --time=01:00:00
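For context, the two directives above could sit in a complete submission script along the following lines. This is only a minimal sketch: the job name, the single-task request, and the placeholder workload are assumptions for illustration, not site requirements.

```shell
#!/bin/bash
# Hypothetical job name, chosen for illustration only.
#SBATCH --job-name=short-example
# Use the short partition (walltime must be under 1 day).
#SBATCH --partition=short
# Request 1 hour of walltime.
#SBATCH --time=01:00:00
# Assumed for this sketch: a single task.
#SBATCH --ntasks=1

# Placeholder workload: report which node the job landed on.
node="$(uname -n)"
echo "Job ${SLURM_JOB_ID:-unset} running on ${node}"
```

Saved under a name of your choice (for example short-example.sh, a hypothetical filename), the script would be submitted with sbatch short-example.sh.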
Selecting a particular CPU Type
The available hardware consists of several sorts of nodes. All nodes have hyper-threading turned off.
- mi* nodes are 36 core Xeon-Gold 6150 @ 2.70GHz servers with 158893MB usable memory
- gp* nodes are 28 core Xeon-E5-2680-v4 @ 2.40GHz servers with 241660MB usable memory. Each server has two P100 GPU cards.
- md* nodes are 48 core Xeon-Gold-5220R @ 2.20GHz servers with 735000MB usable memory
Sometimes users may want to constrain their jobs to a particular CPU type, e.g. for timing reasons. In this case, specify the CPU type you need with a constraint flag in the SLURM submission script. The CPU type of a particular node can be viewed by running this command:
scontrol show node <nodename>
and then looking for the Feature field.
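To pull out just the feature names, the scontrol output can be filtered. The helper below is a small sketch, not a site-provided tool: extract_features is a hypothetical function name, and it assumes the features appear as key=value fields whose key ends in "Features" (recent SLURM versions print AvailableFeatures and ActiveFeatures).

```shell
# Hypothetical helper: read `scontrol show node` output on stdin and
# print each distinct feature list once.
extract_features() {
    # Match any "...Features=<value>" field, keep the value, de-duplicate.
    grep -io 'features=[^ ]*' | cut -d= -f2 | sort -u
}

# Usage on a login node (replace <nodename> with a real node name):
#   scontrol show node <nodename> | extract_features
```

The printed value is what you would pass to --constraint.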
Examples:
# this command requests only mi* nodes that have Xeon-Gold processors
#SBATCH --constraint=Xeon-Gold
or
# this command requests only hc* nodes
#SBATCH --constraint=Xeon-E5-2680-v3
This feature should only be used if you must have a particular processor. Jobs will schedule faster if you do not use it.
Selecting a particular server
Users can restrict a job to a particular server if they wish.
Example: Only run jobs on server ge00
#SBATCH --nodelist=ge00
Selecting a GPU Node
To request one or more GPU cards, you need to specify:
- the gpu partition
- the name and type in a gres statement. Your running program will only be allowed access to the number of cards that you specify. You should not use the constraint feature described above.
# this command requests one A100 card on a node.
#SBATCH --gres=gpu:A100:1
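Putting the two requirements together, a GPU job script might look like the sketch below. The job name, walltime, and workload are assumptions for illustration. The A100 card type is taken from the example above and should be replaced with whatever type the target node actually provides.

```shell
#!/bin/bash
# Hypothetical job name, chosen for illustration only.
#SBATCH --job-name=gpu-example
# GPU jobs must go to the gpu partition.
#SBATCH --partition=gpu
# Request one card by name and type (type copied from the example above).
#SBATCH --gres=gpu:A100:1
# Assumed walltime for this sketch.
#SBATCH --time=01:00:00

# SLURM typically exports CUDA_VISIBLE_DEVICES for the cards it allocated;
# fall back to "none" when the variable is not set.
gpus="${CUDA_VISIBLE_DEVICES:-none}"
echo "Visible GPUs: ${gpus}"

# Show card details only if the NVIDIA tool is present on this node.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi
fi
```

Because the program only sees the cards requested in the gres statement, asking for one card means a single device is visible inside the job.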