.. index::
   single: R

.. _mdoc_R:

===========
R
===========

R is available for statistical computing. R modules are built with both the
GCC and Intel compilers. We recommend the Intel builds, since they performed
best in benchmarks. Some R packages may not compile correctly with the Intel
compilers; use the GCC version in that case. All R modules are built with
OpenMP enabled and use the Intel MKL to improve performance.

The currently available R modules::

    $ module avail R

    ------------------------- /software/modulefiles -----------------------------
    R/2.15(default)    R/2.15+intel-12.1  R/3.0
    R/3.0+intel-12.1   R/3.1              R/3.1+intel-15.0

To install additional R packages in your home directory and use them, set the
environment variable ``R_LIBS_USER``. For example::

    export R_LIBS_USER=$HOME/R_libs

The specified directory should exist before you try to install R packages.

The RStudio IDE is also available as the ``rstudio`` module. It provides a
graphical interface for developing and running R. To use R in this mode,
connect to midway via :ref:`thinlinc`.

.. contents::
   :local:

Serial Examples
===============

Here is a simple "hello world" example that submits an R job to Slurm. It is
appropriate for an R job that expects to use a single CPU.

sbatch script :download:`Rhello.sbatch`:

.. literalinclude:: Rhello.sbatch
   :language: bash

R script :download:`Rhello.R`:

.. literalinclude:: Rhello.R

Output:

.. literalinclude:: Rhello.out

Parallel Examples
=================

For parallel use, there are several options, depending on whether the parallel
tasks run on a single node or across multiple nodes, and on the level of
flexibility required. Other R packages for parallel programming exist beyond
those covered here; this section covers some frequently used ones.

Multicore
---------

On a single node, you can use the doParallel and foreach packages.

sbatch script :download:`doParallel.sbatch`:

.. literalinclude:: doParallel.sbatch
   :language: bash

R script :download:`doParallel.R`:

.. literalinclude:: doParallel.R

Output:

.. literalinclude:: doParallel.out

SNOW
----

For multiple nodes, you can use the SNOW package, which provides a small set
of functions that simplify using multi-node clusters.

sbatch script :download:`snow-test.sbatch`:

.. literalinclude:: snow-test.sbatch
   :language: bash

R script :download:`snow-test.R`:

.. literalinclude:: snow-test.R

Output (trimmed for readability):

.. literalinclude:: snow-test.out

Rmpi
----

For multiple nodes, you can also use Rmpi. This is what snow uses internally.
It is less convenient than snow, but also more flexible. This page has a number
of useful Rmpi examples:
http://www.umbc.edu/hpcf/resources-tara-2010/how-to-run-R.php

sbatch script :download:`Rmpi.sbatch`:

.. literalinclude:: Rmpi.sbatch
   :language: bash

R script :download:`Rmpi.R`:

.. literalinclude:: Rmpi.R

Output (trimmed for readability):

.. literalinclude:: Rmpi.out
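
If a job on this page relies on packages installed in your home directory, the
``R_LIBS_USER`` setup described at the top of this page also has to be in
effect for that job. The following is a minimal sketch, not a site-mandated
recipe. It assumes the ``$HOME/R_libs`` location from the earlier example, and
the lines could be placed in a shell startup file or near the top of an sbatch
script, before R starts:

.. code-block:: bash

   # Assumed library location, taken from the R_LIBS_USER example on this page.
   export R_LIBS_USER="$HOME/R_libs"

   # Create the directory if it is missing; -p makes this a no-op when it
   # already exists, so the line is safe to run in every job.
   mkdir -p "$R_LIBS_USER"

With the directory in place, calling ``.libPaths()`` inside R should list it,
which is a quick way to check that the setting was picked up.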