GPU Computing Jobs

If your application can use GPUs, you can submit jobs to RCC’s gpu or gpu2 partition. The gpu partition is on Midway1 and the gpu2 partition is on Midway2. Together the two partitions provide 15 GPU nodes (9 on Midway1 and 6 on Midway2) with the following specifications:

Partition Name   CPU Cores                  GPUs per Node      Number of Nodes   Memory per Node
gpu              16x E5-2670 @ 2.6 GHz      2x Nvidia M2090    3                 32 GB
gpu              16x E5-2670 @ 2.6 GHz      2x Nvidia K20m     2                 32 GB
gpu              20x E5-2680v2 @ 2.8 GHz    2x Nvidia K40m     4                 64 GB
gpu2             28x E5-2680v4 @ 2.4 GHz    4x Nvidia K80      6                 128 GB

For information on compiling CUDA and OpenACC code on Midway, see CUDA and OpenACC Compilers.

Running GPU code on Midway

When submitting jobs to the GPU nodes, you must include one of the following sets of #SBATCH options:

# to run on Midway1 GPU nodes
#SBATCH --partition=gpu
#SBATCH --gres=gpu:<N>

OR

# to run on Midway2 GPU nodes
#SBATCH --partition=gpu2
#SBATCH --gres=gpu:<N>

The --gres=gpu:<N> option requests N GPU devices on each node on which your job will run. Valid values of N are 1 or 2 in the gpu partition and 1 to 4 in the gpu2 partition. If you request all of the GPUs in a node, we also suggest including the #SBATCH --exclusive option in your submission script to prevent other jobs from being placed on that node.
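As an illustration of the options above, the following fragment sketches a request for every GPU on one gpu2 node. The choice of a single node is an assumption made for the example, not a site recommendation.

```shell
# Illustrative fragment: request all 4 GPUs on one Midway2 GPU node
# and keep other jobs off that node.
#SBATCH --partition=gpu2
#SBATCH --nodes=1
#SBATCH --gres=gpu:4
#SBATCH --exclusive
```

For the gpu partition, the equivalent request would use --partition=gpu and --gres=gpu:2, since those nodes have two GPUs each.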

Additionally, in the gpu partition, you can specify which type of GPU device your job runs on by including a --constraint option in your job submission script. To ensure your job runs on an M2090, K20m, or K40m device, include one of the following lines in your submission script:

#SBATCH --constraint=m2090
or
#SBATCH --constraint=k20m
or
#SBATCH --constraint=k40m
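
Combined with the partition and GPU-count options, a constrained request might look like the sketch below. The GPU count and the choice of K40m are illustrative values only.

```shell
# Illustrative fragment: request both GPUs on a K40m node in the gpu partition.
#SBATCH --partition=gpu
#SBATCH --gres=gpu:2
#SBATCH --constraint=k40m
```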

An example GPU-enabled job script for a CUDA program, gpu.sbatch, is given below:

#!/bin/bash

# This script requests one GPU device and one CPU core

#SBATCH --job-name=gpuSbatch
#SBATCH --output=gpuSbatch.out
#SBATCH --error=gpuSbatch.err
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1


# if your executable was built with CUDA, be sure to load the correct CUDA module:
module load cuda

# if your executable was built with PGI (OpenACC), be sure to load the PGI module:
module load pgi/2013

#
# your GPU-based executable here
#
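
Before launching the executable, it can be useful to confirm which GPU devices the job was given. The lines below are a minimal sketch of such a check. The helper name count_visible_gpus is hypothetical, and the sketch assumes Slurm exports CUDA_VISIBLE_DEVICES as a comma-separated list for jobs that request GPUs with --gres, which is common but depends on site configuration.

```shell
# Hypothetical helper: report how many GPU devices the job can see.
# Assumes CUDA_VISIBLE_DEVICES holds a comma-separated list such as "0,1";
# an unset or empty variable is reported as 0 devices.
count_visible_gpus() {
    local devices="${CUDA_VISIBLE_DEVICES:-}"
    if [ -z "$devices" ]; then
        echo 0
        return
    fi
    # Split the list on commas and count the entries.
    local IFS=','
    set -- $devices
    echo "$#"
}

echo "Visible GPU devices: $(count_visible_gpus)"

# List the devices if nvidia-smi is available on the node.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || echo "nvidia-smi could not query the devices"
fi
```

With the script saved as gpu.sbatch, it can be submitted with sbatch gpu.sbatch. The output of the check then appears in the file named by the --output option.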