Partitions in MonARCH
Partitions Available
All nodes currently belong to one big partition. Users discriminate between different hardware types with extra parameters within their submission script.
The default partition for all submitted jobs is:
- comp for compute nodes. The maximum walltime is one week.
Other partitions include:
- short for jobs with a walltime < 1 day.
- gpu for the GPU nodes
- himem, a restricted partition for the 1.5TB high-memory machine. Please contact the help desk if you need to use this.
Example: To use the short partition for a job that needs up to 1 hour, put this in your Slurm submission script.
#SBATCH --partition=short
#SBATCH --time=01:00:00
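For context, the two directives above can be placed in a complete submission script. The following is a minimal sketch, not an official MonARCH template: the job name, the task count, and the describe_job helper are hypothetical, and it assumes Slurm exports SLURM_JOB_PARTITION inside the running job.

```shell
#!/bin/bash
#SBATCH --job-name=short-demo   # hypothetical job name
#SBATCH --partition=short
#SBATCH --time=01:00:00
#SBATCH --ntasks=1              # assumed: a single task

# Report where the job landed. The fallback value "none" covers
# runs outside Slurm, where SLURM_JOB_PARTITION is not set.
describe_job() {
    echo "partition=${SLURM_JOB_PARTITION:-none} host=$(uname -n)"
}

describe_job
```

Such a script would typically be submitted with sbatch followed by the script's filename. Because the #SBATCH lines are ordinary comments to bash, the script can also be run directly as a quick syntax check before submitting.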
Selecting a particular CPU Type
The available hardware consists of several types of node. All nodes have hyper-threading turned off.
- mi* nodes are 36 core Xeon-Gold 6150 @ 2.70GHz servers with 158893MB usable memory
- gp* nodes are 28 core Xeon-E5-2680-v4 @ 2.40GHz servers with 241660MB usable memory. Each server has two P100 GPU cards.
- md* nodes are 48 core Xeon-Gold-5220R @ 2.20GHz servers with 735000MB usable memory
Sometimes users may want to restrict a job to a particular CPU type, e.g. for timing reasons. In this case, specify the CPU type you need with a constraint flag in the Slurm submission script. The CPU type of a particular node can be viewed by running this command:
scontrol show node <nodename>
and then looking for the Feature field in the output.
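The scontrol output is long, so it can help to filter it down to the feature entries. The helper below is a small sketch: node_features is a hypothetical name, and it assumes the field name contains the text "Features=" (depending on the Slurm version it may appear as Features, AvailableFeatures, or ActiveFeatures).

```shell
#!/bin/bash
# Hypothetical helper: read `scontrol show node` output on stdin and
# print only the entries whose name contains "Features=".
node_features() {
    # Put every key=value pair on its own line, then keep the feature ones.
    tr ' ' '\n' | grep -i 'Features='
}

# Intended use on the cluster (replace NODENAME with a real node name):
#   scontrol show node NODENAME | node_features
```

The strings printed after the equals sign are the values that can be passed to --constraint.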
Examples:
# this command requests only mi* nodes that have Xeon-Gold processors
#SBATCH --constraint=Xeon-Gold
or
# this command requests only hc* nodes
#SBATCH --constraint=Xeon-E5-2680-v3
This feature should only be used if you must have a particular processor. Jobs will schedule faster if you do not use it.
Selecting a particular server
Users can restrict a job to a particular server if they wish.
Example: Only run jobs on server ge00
#SBATCH --nodelist=ge00
Selecting a GPU Node
To request one or more GPU cards, you need to specify:
- the gpu partition
- the name and type in a gres statement. Your running program will only be allowed access to the number of cards that you specify. You should not use the constraint feature described above.
# this command requests one A100 card on a node.
#SBATCH --gres=gpu:A100:1
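Putting the two requirements together (the gpu partition plus a gres statement), a GPU job script might look like the sketch below. The walltime and the visible_gpus helper are hypothetical, and the script assumes the site's Slurm configuration exports CUDA_VISIBLE_DEVICES for allocated cards, which is common but not confirmed here.

```shell
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:A100:1
#SBATCH --time=00:30:00   # assumed walltime, adjust to your job

# Print the GPU indices the job is allowed to use, or "none" when
# CUDA_VISIBLE_DEVICES is unset (for example when run outside Slurm).
visible_gpus() {
    echo "${CUDA_VISIBLE_DEVICES:-none}"
}

echo "GPUs visible to this job: $(visible_gpus)"
```

Printing the visible devices at the start of a job is a cheap way to confirm that the program will only see the number of cards requested in the gres statement.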