GPU Computing Jobs
If you have an application that can use GPUs, you can submit a job to RCC’s gpu or gpu2 partition. The gpu partition is on Midway1 and the gpu2 partition is on Midway2. Together, the two partitions include 15 GPU nodes (9 on Midway1 and 6 on Midway2) with the following specifications:
Partition Name | CPU Cores/GPUs per Node | Number of Nodes | Memory per Node |
---|---|---|---|
gpu | 16x E5-2670 @ 2.6 GHz / 2x Nvidia M2090 | 3 | 32 GB |
gpu | 16x E5-2670 @ 2.6 GHz / 2x Nvidia K20m | 2 | 32 GB |
gpu | 20x E5-2680v2 @ 2.8 GHz / 2x Nvidia K40m | 4 | 64 GB |
gpu2 | 28x E5-2680v4 @ 2.4 GHz / 4x Nvidia K80 | 6 | 128 GB |
For information on compiling CUDA and OpenACC code on Midway, see CUDA and OpenACC Compilers.
Running GPU code on Midway
When submitting jobs to the GPU nodes, you must include one of the following sets of #SBATCH options:
# to run on Midway1 GPU nodes
#SBATCH --partition=gpu
#SBATCH --gres=gpu:<N>
OR
# to run on Midway2 GPU nodes
#SBATCH --partition=gpu2
#SBATCH --gres=gpu:<N>
The flag --gres=gpu:N requests N GPU devices on each of the nodes on which your job will run. Valid values of N are 1 or 2 in the gpu partition and 1 to 4 in the gpu2 partition. If you are requesting all GPUs in a node, we also suggest including the #SBATCH --exclusive option in your submission script to prevent other jobs from being placed on that node.
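As a rough illustration of the full-node case described above, the sketch below requests all four GPUs on a single gpu2 node together with --exclusive. The job name, the time limit, and the helper function report_visible_gpus are hypothetical. The script also assumes that Slurm exports CUDA_VISIBLE_DEVICES for GPUs allocated through --gres, which is common but depends on how the cluster is configured.

```bash
#!/bin/bash
# Sketch only: request every GPU on one Midway2 GPU node and keep the node private.
# The job name and time limit below are hypothetical.
#SBATCH --job-name=gpu2FullNode
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --partition=gpu2
#SBATCH --gres=gpu:4
#SBATCH --exclusive

# Assumption: Slurm sets CUDA_VISIBLE_DEVICES for the GPUs it allocated.
# Outside a GPU allocation the variable is unset, so "none" is printed.
report_visible_gpus() {
    echo "GPUs visible to this job: ${CUDA_VISIBLE_DEVICES:-none}"
}

report_visible_gpus
```

Printing the visible devices at the start of a job is a cheap way to confirm that the allocation matches the --gres request before the real workload starts.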
Additionally, in the gpu partition, you can specify which type of GPU device your job runs on by including a --constraint option in your job submission script. To ensure your job runs on an M2090, K20m, or K40m device, include one of the following lines in your submission script:
#SBATCH --constraint=m2090
or
#SBATCH --constraint=k20m
or
#SBATCH --constraint=k40m
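To make the constraint usage concrete, here is a minimal sketch of a job script that pins the job to a K40m node in the gpu partition and then prints the GPU model it received. The job name, the time limit, and the helper report_gpu_model are hypothetical. The script assumes nvidia-smi is on the PATH of the GPU nodes and prints a note instead when it is missing or cannot query the devices.

```bash
#!/bin/bash
# Sketch only: run on a Midway1 GPU node that has K40m devices.
# The job name and time limit below are hypothetical.
#SBATCH --job-name=k40mCheck
#SBATCH --time=00:05:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --constraint=k40m

# Print the model name of each visible GPU, or a note if that is not possible.
report_gpu_model() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name --format=csv,noheader \
            || echo "nvidia-smi could not query the GPUs"
    else
        echo "nvidia-smi not found; cannot report GPU model"
    fi
}

report_gpu_model
```

Swapping k40m for m2090 or k20m in the --constraint line should select the other device types listed above.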
An example GPU-enabled job script for a CUDA program, gpu.sbatch, is given below:
#!/bin/bash
# This script requests one GPU device and one CPU core
#SBATCH --job-name=gpuSbatch
#SBATCH --output=gpuSbatch.out
#SBATCH --error=gpuSbatch.err
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
# if your executable was built with CUDA, be sure to load the correct CUDA module:
module load cuda
# if your executable was built with PGI (OpenACC), be sure to load the PGI module:
module load pgi/2013
#
# your GPU-based executable here
#
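Once the script above is saved as gpu.sbatch, it could be submitted and monitored roughly as follows. This is a sketch, not an RCC-provided tool: the helper submit_gpu_job is hypothetical, and it relies only on the standard Slurm commands sbatch (with --parsable, which prints just the job ID) and squeue.

```bash
#!/bin/bash
# Hypothetical helper: submit a job script and show its entry in the queue.
submit_gpu_job() {
    local script="$1"
    local job_id

    if ! command -v sbatch >/dev/null 2>&1; then
        echo "sbatch not found on PATH" >&2
        return 1
    fi

    # --parsable makes sbatch print "jobid" or "jobid;cluster" and nothing else.
    job_id="$(sbatch --parsable "$script")" || return 1
    job_id="${job_id%%;*}"

    echo "Submitted job ${job_id}"
    squeue -j "$job_id"
}

# Submit the script named on the command line, defaulting to gpu.sbatch.
if ! submit_gpu_job "${1:-gpu.sbatch}"; then
    echo "Submission failed" >&2
fi
```

When the job finishes, its output and errors land in the files named by the --output and --error options, which are gpuSbatch.out and gpuSbatch.err in the example script.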