Large Memory Jobs¶
If your job requires more than 64 GB of memory, you can use a node in the bigmem or bigmem2 partition. These two partitions comprise eight nodes (2 nodes on Midway1 and 6 nodes on Midway2) with the following specifications:
| Partition Name | CPU Cores per Node | Number of Nodes | Memory per Node |
|---|---|---|---|
| bigmem | 16x E5-2670 @ 2.6 GHz | 1 | 256 GB |
| bigmem | 32x E7-8837 @ 2.66 GHz | 1 | 1024 GB |
| bigmem2 | 28x E5-2680v4 @ 2.4 GHz | 6 | 512 GB |
If you are not sure how much memory your job used, you can find out (once the job has finished) by adding the maxrss and maxvmsize format options to sacct, like so, where <jobid> is the ID number of your job:
sacct -j <jobid> --format=jobid,jobname,partition,account,alloccpus,state,cputime,maxrss,maxvmsize
The last two columns of this output report the maximum resident set size (RAM) and the maximum virtual memory size across all tasks in the job.
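As a concrete example, with a hypothetical job ID of 1234567, the command would be:
sacct -j 1234567 --format=jobid,jobname,partition,account,alloccpus,state,cputime,maxrss,maxvmsize
sacct typically reports MaxRSS and MaxVMSize in kilobytes (with a K suffix), so a MaxRSS of 67108864K corresponds to roughly 64 GB.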
Running Large Memory Jobs on Midway¶
When submitting jobs to the big memory nodes, you must include one of the following #SBATCH options in your job submission script:
# to run on Midway1
#SBATCH --partition=bigmem
Or
# to run on Midway2
#SBATCH --partition=bigmem2
The above options will ensure that your job runs on a node in either the bigmem or bigmem2 partition.
Note
If you want to use the bigmem partition, please make sure that you have compiled your code on one of the Midway1 login nodes; if you want to use the bigmem2 partition, please make sure that you have compiled your code on one of the Midway2 login nodes. Not doing so may cause your job to crash.
When running your job in the bigmem partition, you may decide to run it on the 256 GB node. In this case, you must also include the following #SBATCH option:
#SBATCH --constraint=256G
Similarly, if you need to ensure your job runs on the 1 TB node, you must include the following #SBATCH option:
#SBATCH --constraint=1024G
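For instance, to guarantee that a job lands on the 1 TB node, both directives appear together in the job script header:
# run on the Midway1 big memory partition, restricted to the 1 TB node
#SBATCH --partition=bigmem
#SBATCH --constraint=1024G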
If you have used maxrss from above, or if you simply have a good idea of how much memory your job will need, you can use the --cpus-per-task and --mem-per-cpu options to request the amount of memory you need. Note that --mem-per-cpu specifies memory per CPU core (in MB by default), so if your job requires multiple CPUs you will need to divide the total amount of memory by the number of CPUs you are requesting. For example, to request 8 CPU cores and 128 GB of memory (128 GB / 8 cores = 16 GB per core, i.e. 16000 MB per core), you would use the following #SBATCH options:
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=16000
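Putting these pieces together, a complete submission script for the 8-core, 128 GB example might look like the following sketch (the job name, time limit, and program name are illustrative assumptions; adjust them to your own workflow):
#!/bin/bash
#SBATCH --job-name=large-mem-job    # illustrative job name
#SBATCH --partition=bigmem2         # Midway2 big memory partition
#SBATCH --time=02:00:00             # assumed 2-hour wall time
#SBATCH --ntasks=1                  # a single task
#SBATCH --cpus-per-task=8           # 8 CPU cores for that task
#SBATCH --mem-per-cpu=16000         # 16000 MB (16 GB) per core = 128 GB total

./my_program                        # hypothetical executable
You would then submit the script with sbatch, e.g. sbatch large_mem.sbatch (the file name is illustrative).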
Running Large Memory Jobs Interactively¶
When using the bigmem or bigmem2 partitions interactively, the same options from above can be passed to sinteractive on the command line. For example, to access the 256 GB node while requesting 1 CPU core and 128 GB of memory, you would use the following command:
sinteractive --partition=bigmem --constraint=256G --ntasks=1 --cpus-per-task=1 --mem-per-cpu=128000
Similarly, to access a node from the bigmem2 partition while requesting 8 CPU cores and 128 GB of memory, you would use the following command:
sinteractive --partition=bigmem2 --ntasks=1 --cpus-per-task=8 --mem-per-cpu=16000