Frequently Asked Questions

General

How do I cite RCC in my publications and talks?

Citations and acknowledgements help RCC demonstrate the importance of computational resources and support staff in research at UChicago. We ask that an acknowledgment be given to RCC in any publications, presentations, or talks that were made possible by resources RCC provided. Please reference the RCC as “The University of Chicago Research Computing Center” in citations and acknowledgements. Acceptable example citations are below.

  • This work was completed in part with resources provided by the University of Chicago Research Computing Center.
  • We are grateful for the support of the University of Chicago Research Computing Center for assistance with the calculations carried out in this work.
  • We acknowledge the University of Chicago Research Computing Center for support of this work.

If you cite or acknowledge RCC in your work, please alert RCC by sending a brief email to info@rcc.uchicago.edu.

Getting Started

How do I become an RCC user?

RCC user account requests should be submitted via our online application forms. See RCC Account Request for more information.

How do I access RCC systems?

There are various ways to access RCC systems.

How do I request access to a PI’s account if I already have an account on Midway?

Please submit a User Account Request and provide both your CNetID and the PI Account name (typically pi- followed by the CNetID of the PI). The PI will receive an automated email requesting authorization for this request.

What is my RCC username and password?

RCC uses University of Chicago CNetIDs for user credentials. When your RCC account is created, your RCC username/password will be the same as your CNetID credentials.

Can an external collaborator get a CNetID so they can log in to RCC?

RCC can create CNetIDs for external collaborators as necessary. See RCC Account Request for more information.

What do I do if I left UChicago and my CNetID password no longer works?

It should be possible to use your CNetID for authentication indefinitely, but IT services may expire it when you leave. If you have an RCC account, but you still can’t log in, it is likely that password authentication has been disabled by IT services. Contact help@rcc.uchicago.edu if you believe your account should be reinstated.

How do I change/reset my password?

RCC cannot change or reset your password. Go to the CNet Password Recovery page to change or reset your password.

What groups am I a member of?

To list the groups you are a member of, type groups on any RCC system.

How do I access the data visualization lab in the Zar room of Crerar Library?

The Zar room and its visualization equipment can be reserved for events, classes, or visualization work by contacting RCC at help@rcc.uchicago.edu. More information regarding RCC’s visualization facilities can be found on the RCC Data Visualization webpage.

What login shells are supported and how do I change my default shell?

RCC supports the following shells:

  • /bin/bash
  • /bin/tcsh
  • /bin/zsh

Use this command to change your shell:

$ chsh -s /path/to/shell

It may take up to 30 minutes for that change to become active.

Is remote access with Mosh supported?

Yes. To use Mosh, first log in to Midway via SSH, and add the command module load mosh to your ~/.bashrc (or ~/.zshenv if you use zsh). Then, you can log in by entering the following command in a terminal window:

$ mosh <CNetID>@midway.rcc.uchicago.edu

Is SSH key authentication allowed on RCC machines?

SSH key pair authentication is allowed on Midway. RCC recommends using passphrase-protected private keys in combination with an SSH agent on your client machine for security. Add your public key to $HOME/.ssh/authorized_keys on Midway to allow access from your machine.

Why is SSH key authentication not working even though I’ve added my public key to $HOME/.ssh/authorized_keys?

The default umask on Midway can cause an issue with SSH key authentication. Use this command to correct the permissions if there are problems using key-based authentication:

$ chmod -R g-w ~/.ssh
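
The effect of that command can be checked on a scratch directory before touching a real ~/.ssh. The sketch below is illustrative only: it assumes GNU stat (for the -c %a format), and the temporary directory and its starting modes are made up here to mimic what a group-writable umask would produce.

```shell
# Build a throwaway directory that stands in for ~/.ssh (hypothetical setup).
demo=$(mktemp -d)
mkdir "$demo/.ssh"
touch "$demo/.ssh/authorized_keys"

# Start from group-writable modes, as a permissive umask would create them.
chmod 775 "$demo/.ssh"
chmod 664 "$demo/.ssh/authorized_keys"

# The fix from this FAQ: remove group write permission recursively.
chmod -R g-w "$demo/.ssh"

# Record the resulting octal modes, then clean up.
dir_mode=$(stat -c %a "$demo/.ssh")
file_mode=$(stat -c %a "$demo/.ssh/authorized_keys")
echo "$dir_mode $file_mode"   # prints "755 644"
rm -rf "$demo"
```

Only the group write bit changes; owner and other permissions are left as they were.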

Why am I getting “ssh_exchange_identification: read: Connection reset by peer” when I try to log in via SSH?

You can get this error if your account was just added and you try to log in too many times before the account is fully provisioned, or if you enter your password incorrectly too many times. These limits reduce the ability of malicious users to mount brute-force SSH attacks against our systems. The limit is 5 failed login attempts for an invalid user and 10 failed login attempts for a valid user entering a bad password. If these limits are exceeded, an IP address block is added and remains active for 4 hours. If necessary, contact RCC support and we can remove the block earlier.

Allocations

What is an allocation?

An allocation is a quantity of computing time and storage resources that is granted to a PI or Project account. An allocation is necessary to run jobs on RCC systems. See RCC Allocations for more details.

What is a service unit (SU)?

A service unit (SU) is an abstract quantity of computing resources defined to be equal to one core-hour, that is, one core used for one hour.

How do I obtain an allocation?

The RCC accepts proposals for large allocations bi-annually. Requests for medium-sized allocations, special purpose allocations for time-critical research, or allocations for education and outreach may be submitted at any time. See RCC Allocations for more information.

How is my usage charged to my account?

The charge associated with a job on Midway is the product of the following factors:

  • The number of cores assigned to the job
  • The elapsed wall-clock time in hours

Multiplying these quantities results in the number of SUs deducted from your account. Usage is tracked in units of 0.01 SU.
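
As a worked example of that arithmetic, the sketch below computes the charge for a job from its core count and elapsed seconds. The function name su_charge is hypothetical, and rounding to the nearest 0.01 SU is an assumption: this FAQ states only that usage is tracked in units of 0.01 SU, not how partial units are handled.

```shell
# su_charge CORES SECONDS
# Prints cores x wall-clock hours, formatted to two decimal places.
# Assumption: values are rounded to the nearest 0.01 SU.
su_charge() {
    awk -v cores="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", cores * secs / 3600 }'
}

# A job that held 16 cores for 1.5 hours (5400 seconds).
charge=$(su_charge 16 5400)
echo "$charge"   # prints "24.00"
```

The authoritative figures are the ones reported by the accounts tool; this is only a way to estimate a charge before submitting.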

How do I check the balance of my allocation?

The accounts tool provides an easy way for users to check their account balance. Use the following command to query your balance in the current allocation period:

$ accounts balance

How do I see how my allocation has been used?

The accounts tool has a number of methods for summarizing allocation usage. To see an overall summary, use:

$ accounts usage

To see the individual jobs that contribute to that usage, use the --byjob option:

$ accounts usage --byjob

Software

What software does RCC offer on its compute systems?

Software available within the RCC environment is continuously changing, and we often add new software and new versions of existing software. Information about available software and how to use specific software packages can be found in the Software section of this manual.

To view the current list of installed software, log in to any RCC system and use the command:

$ module avail

To view software versions available for a specific piece of software use this command:

$ module avail <software>

How do I use the modules utility?

The module system can be accessed by entering module on the command line. More information can also be found in the Software section of the User Guide.

How do I get help with RCC software?

The primary resource for any software is the official documentation and manual which can be accessed by issuing the following command:

$ man <command>

RCC also maintains supplementary documentation for issues specific to our systems, including basic usage and customizations. Consult the Software page for more information.

Why is command XXX not available?

You probably have not loaded the appropriate software module. To use most software packages, you must first load the appropriate software module. See Software for more information on how to use pre-installed software on RCC systems.

Why do I get an error that says a module cannot be loaded due to a conflict?

Some modules are incompatible with each other and cannot be loaded simultaneously. This is especially true for software that provides the same commands, such as MPI implementations which all provide mpirun and mpiexec commands.

The module command typically gives you a hint about which module conflicts with the one you are trying to load. If you see such an error you will need to remove the previously loaded module with the command:

$ module unload <module name>

How do I request installation of a new or updated software package?

Send an email to help@rcc.uchicago.edu with the details of your software request, including which software package you need, which version, and any optional dependencies you require.

Why do all module commands in my scripts fail with module command not found?

Depending on your shell or working environment, it is possible that the module setup commands aren’t run. To correct this you need to source the appropriate shell startup scripts in your script.

  • For bash/sh/zsh add . /etc/profile
  • For tcsh add source /etc/csh.cshrc
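
For a bash script, this can be done defensively so that the startup file is read only when needed. The following is a minimal sketch, not an RCC-provided template: the module_state variable is introduced here purely for illustration, and "." is used as the portable spelling of source.

```shell
#!/bin/bash
# Read the system startup file only if the module command is not defined yet.
if ! type module >/dev/null 2>&1 && [ -r /etc/profile ]; then
    . /etc/profile
fi

# Report whether module is usable before the script relies on it.
if type module >/dev/null 2>&1; then
    module_state="available"
else
    module_state="missing"
fi
echo "module command: $module_state"
```

On a machine without the modules utility installed, the script reports "missing" instead of failing later with a command-not-found error.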

Why can’t I run Gaussian?

Gaussian’s creators have a strict usage policy, so we have limited its availability on RCC systems. If you need to use Gaussian for your research, please contact help@rcc.uchicago.edu to request access.

Cluster Usage

How do I submit a job to the queue?

RCC systems use Slurm to manage resources and job queues. For more information on how to run specific types of jobs consult the Running Jobs on Midway section of this manual.

Can I log in directly to a compute node?

You may obtain a shell on a compute node through the utility sinteractive. This command takes the same arguments as sbatch. More information about interactive jobs is available here: Interactive Jobs

How do I run a set of jobs in parallel?

There are a variety of methods for configuring parallel jobs based on the software package and resource requirements. Two commonly used approaches are Parallel Batch Jobs and Job Arrays.

What are the queue limits?

Run accounts qos on Midway to view the current limits.

If I belong to multiple accounts how do I choose which one is charged?

If you have multiple accounts, jobs will get charged to your default account unless you add the --account=<account> option to your submit options.

You can see your default account with this command:

$ sacctmgr list user $USER

If you would like to permanently change your default account you can run this command:

$ sacctmgr modify user $USER set defaultaccount=<account>

Or contact RCC support and we will change it.

Why isn’t my job starting?

There are a number of reasons that your job may be sitting in the queue. The output of the command squeue will typically help determine why your job is not running. Look at the NODELIST(REASON) column. A pending job may have these reasons:

  • (Priority): Other jobs have priority over your job.
  • (Resources): Your job has enough priority to run, but there aren’t enough free resources to run it.
  • (QOSResourceLimit): Your job exceeds the QOS limits. The QOS limits include wall time, number of jobs a user can have running at once, number of nodes a user can use at once, etc. This may or may not be a permanent status. If your job requests a wall time greater than what is allowed or exceeds the limit on the number of nodes a single job can use, this status will be permanent. However, your job may be in this status if you currently have jobs running and the total number of jobs running or aggregate node usage is at your limits. In this case, jobs in this state will become eligible when your existing jobs finish.
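
The reasons listed above can be turned into a small lookup helper for use in your own scripts. This is an illustrative sketch only: explain_reason is a hypothetical function name, and it covers just the three reasons described in this FAQ. It runs anywhere; on Midway the reason string would be read from the NODELIST(REASON) column of squeue --user="$USER".

```shell
# explain_reason REASON
# Maps a pending reason, with or without parentheses, to a short explanation.
explain_reason() {
    local reason=$1
    reason=${reason#"("}   # drop a leading "(" if present
    reason=${reason%")"}   # drop a trailing ")" if present
    case "$reason" in
        Priority)         echo "Other jobs have priority over this job." ;;
        Resources)        echo "Not enough free resources are available yet." ;;
        QOSResourceLimit) echo "The job exceeds a QOS limit." ;;
        *)                echo "Not covered here; contact RCC support." ;;
    esac
}

explanation=$(explain_reason "(Resources)")
echo "$explanation"   # prints "Not enough free resources are available yet."
```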

Please contact RCC support if you feel that your job is not being properly queued.

Note

If you see a large number of jobs aren’t running when resources are idle, RCC staff may have an upcoming maintenance window. Your job may be requesting a wall time which will overlap our maintenance window, which will cause the job to stay in the queue until after maintenance is performed. RCC staff will notify users via email both prior to performing maintenance and after the maintenance is completed.

Why does my job fail after a couple of seconds?

There is most likely a problem in your job submission script (e.g., the program you are attempting to run cannot be found by a compute node), or the program you are attempting to run is producing an error and terminating prematurely.

If you require further assistance troubleshooting the problem, send your submission script and output from your job to help@rcc.uchicago.edu.

Why does my job fail with “exceeded memory limit, being killed”?

By default, Slurm allocates 2GB of memory per CPU core used in a job. This follows from the fact that most Midway nodes contain 16 cores and 32GB of memory. If your job requires more than the default amount of memory per core, you must include the --mem-per-cpu=<MB> option in your sbatch job script. For example, to use 16 CPU cores and 256GB of memory on a bigmem node, the required sbatch flags would be: --ntasks=16 --cpus-per-task=1 --mem-per-cpu=16000
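
The per-core value is simply the total memory divided by the number of cores. The sketch below reproduces the arithmetic of the example above; mem_flags is a hypothetical helper, and it assumes 1GB is counted as 1000MB, which is what the 256GB to 16000MB example implies.

```shell
# mem_flags TOTAL_GB CORES
# Prints sbatch flags that spread TOTAL_GB evenly over CORES single-CPU tasks.
# Assumption: 1 GB = 1000 MB, matching the example in this FAQ.
mem_flags() {
    local total_gb=$1 cores=$2
    local per_cpu_mb=$(( total_gb * 1000 / cores ))
    echo "--ntasks=${cores} --cpus-per-task=1 --mem-per-cpu=${per_cpu_mb}"
}

flags=$(mem_flags 256 16)
echo "$flags"   # prints "--ntasks=16 --cpus-per-task=1 --mem-per-cpu=16000"
```

Calling mem_flags 32 16 gives --mem-per-cpu=2000, which corresponds to the 2GB per core default on a 16-core, 32GB node.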

Why does my sinteractive job fail with “Connection to <host> closed.”?

There are two likely causes of this error. The first is that your job exceeded its time limit. The default time limit for sinteractive is 2 hours. This can be increased with the --time=<timespec> option.

The second possibility is that your job exceeded the memory limit. You will need to request additional memory with the --mem-per-cpu=<MB> option. See the previous question for more details.

How do I get the maximum wall time for my jobs increased?

The RCC queuing system attempts to provide fair and balanced resource allocation to all RCC users. The maximum wall time per job exists to prevent individual users from using more than their fair share of cluster resources.

If your particular job requires an extraordinary amount of wall time, please submit a special request for resources to help@rcc.uchicago.edu.

Can I create a cron job?

RCC does not support users creating cron jobs. However, it is possible to use Slurm to submit cron-like jobs. See Cron-Like Jobs for more information.

Performance and Coding

What compilers does RCC support?

RCC supports the GNU, Intel Composer, and PGI compiler suites. See the compilers section of this manual for more details.

Which versions of MPI does RCC support?

RCC maintains builds of OpenMPI, IntelMPI, and MVAPICH2 for supported compilers. See Message Passing Interface (MPI) for more documentation and samples for MPI.

Can RCC help me parallelize and optimize my code?

Support staff is available to consult with your research team to help parallelize and optimize your code for use on RCC systems. Contact RCC staff at help@rcc.uchicago.edu to arrange a consultation.

Does RCC provide GPU computing resources?

Yes. RCC maintains a number of GPU-equipped compute nodes. For details on how to submit jobs to the GPU nodes see GPU Computing Jobs.

File I/O, Storage, and Transfers

How much storage space have I used / do I have access to?

Use the quota command to list your current usage and available storage.

How do I get my storage quota increased?

Additional storage is available through the Cluster Partnership Program. In certain cases, additional storage can be acquired through a Research II Allocation or Special Allocation.

How do I share files?

Using your group’s /project directory is the preferred way to share data amongst group members. Project directories are created for all PI and project accounts. The default permissions restrict access to the project group account, but permissions can be customized to allow access to other users.

I accidentally deleted/corrupted a file, how do I restore it?

The best way to recover a deleted/corrupted file is from a snapshot. More information about snapshots is available here: Data Recovery and Backups.

How do I request a restore of my files from tape backup?

RCC maintains a tape backup of all home and project directories, but only for disaster recovery purposes. There is no long term history of files on tape. You should use file system snapshots to retrieve a previous version of a file or directory. See Data Recovery and Backups for more information.