Large Memory Jobs

If your job requires more than 64 GB of memory, you can use a node in the bigmem or bigmem2 partition. These two partitions comprise eight nodes (2 nodes on Midway1 and 6 nodes on Midway2) with the following specifications:

Partition Name   CPU                   Cores per Node   Number of Nodes   Memory per Node
bigmem           E5-2670 @ 2.6 GHz     16               1                 256 GB
bigmem           E7-8837 @ 2.66 GHz    32               1                 1024 GB
bigmem2          E5-2680v4 @ 2.4 GHz   28               6                 512 GB
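
To check the current state of these nodes before submitting, you can query the partitions with sinfo. The following command is a minimal sketch using standard Slurm format specifiers (partition, node list, CPUs, memory, and state); adjust the fields to taste:

sinfo --partition=bigmem,bigmem2 -o "%P %N %c %m %T"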

If you are not sure how much memory your job used, you can find out (once your job has finished) by using the maxrss and maxvmsize format options with sacct, like so, with <jobid> being the ID number of your job:

sacct -j <jobid> --format=jobid,jobname,partition,account,alloccpus,state,cputime,maxrss,maxvmsize

The last two columns of this output show the maximum resident set size (RAM) and maximum virtual memory size across all tasks in the job.
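
MaxRSS and MaxVMSize are reported in kilobytes by default. If the Slurm version installed on your cluster supports the --units option (an assumption about your installation), you can request more readable output, for example in gigabytes:

sacct -j <jobid> --format=jobid,maxrss,maxvmsize --units=G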

Running Large Memory Jobs on Midway

When submitting jobs to the big memory nodes, you must include one of the following #SBATCH options in your job submission script:

# to run on Midway1
#SBATCH --partition=bigmem

Or

# to run on Midway2
#SBATCH --partition=bigmem2

The above options will ensure that your job runs on a node in either the bigmem or bigmem2 partition.

Note

If you want to use the bigmem partition, please make sure that you have compiled your code on one of the Midway1 login nodes, and if you want to use the bigmem2 partition, please make sure that you have compiled your code on one of the Midway2 login nodes. Not doing so may cause your job to crash.

When running your job in the bigmem partition, you may decide to run it on the 256 GB node. In this case, you must also include the following #SBATCH option:

#SBATCH --constraint=256G

Similarly, if you need to ensure your job runs on the 1 TB node, you must include the following #SBATCH option:

#SBATCH --constraint=1024G

If you have used maxrss as described above, or if you simply have a good idea of how much memory your job will need, you can use the --cpus-per-task and --mem-per-cpu options to request that amount of memory. Note that --mem-per-cpu specifies memory per CPU core, so if your job uses multiple CPUs you must divide the total memory you need by the number of CPUs you are requesting. For example, to request 8 CPU cores and 128 GB of memory (128 GB / 8 cores = 16 GB per core), you would use the following #SBATCH options:

#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=16000
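
Putting these options together, a complete submission script for a single-task, multi-threaded job on the bigmem2 partition might look like the sketch below. The job name, output file, walltime, and program invocation (./my_program and its --threads flag) are placeholders for illustration; replace them with your own values. Submit the script with sbatch.

#!/bin/bash
#SBATCH --job-name=bigmem-example
#SBATCH --output=bigmem-example-%j.out
#SBATCH --partition=bigmem2
#SBATCH --time=08:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=16000

# Run the application with one thread per allocated core
./my_program --threads=$SLURM_CPUS_PER_TASK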

Running Large Memory Jobs Interactively

When using the bigmem or bigmem2 partitions interactively, the same options shown above can be passed directly to the sinteractive command. For example, to access the 256 GB node while requesting 1 CPU core and 128 GB of memory, you would use the following command:

sinteractive --partition=bigmem --constraint=256G --ntasks=1 --cpus-per-task=1 --mem-per-cpu=128000

Similarly, to access a node from the bigmem2 partition while requesting 8 CPU cores and 128 GB of memory, you would use the following command:

sinteractive --partition=bigmem2 --ntasks=1 --cpus-per-task=8 --mem-per-cpu=16000
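
Once the interactive session starts, you can confirm what was actually allocated. The following commands are a small sketch using standard Slurm environment variables and scontrol output fields:

# Number of CPU cores allocated to this task
echo $SLURM_CPUS_PER_TASK

# Partition and allocated trackable resources (CPU, memory) for this job
scontrol show job $SLURM_JOB_ID | grep -E "Partition|TRES"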