.. index:: single: Stata .. _mdoc_stata: ======= Stata_ ======= Stata is a powerful statistical software package that is widely used in scientific computing. RCC users are licensed to use Stata on all RCC resources. Stata can be used interactively or as a submitted script. Please note that if you would like to run it interactively, you must still run it on a compute node, in order to keep the login nodes free for other users. Stata can be run in parallel on up to 16 nodes. .. note:: Stata examples in this document are adapted from a Princeton tutorial_. You may find it useful if you are new to Stata or want a refresher. .. _tutorial: http://data.princeton.edu/stata/ .. contents:: :local: Getting Started ------------------ If you need to use the Stata GUI, connect to Midway with :ref:`thinlinc`. Obtain an interactive session on a compute node. This is necessary so that your computation doesn't interrupt other users on the login node. Now, load Stata: .. code:: bash sinteractive module load stata xstata This will open up a Stata window. The middle pane has a text box to enter commands at the bottom, and a box for command results on top. On the left there's a box called "Review" that shows your command history. The right-hand box contains information about variables in the currently-loaded data set. One way Stata can be used is as a fancy desktop calculator. Type the following code into the command box: .. code:: display 2+2 Stata can do much more if data is loaded into it. The following code loads census data that ships with Stata, prints a description of the data, then creates a graph of life expectancy over GNP: .. code:: sysuse lifeexp describe graph twoway scatter lexp gnppc Running Stata from the command line ------------------------------------------- This is very similar to running graphically; the command-line interface is equivalent to the "Results" pane in the graphical interface. Again, please use a compute node if you are running computationally-intensive calculations: .. code:: bash sinteractive module load stata stata Running Stata Jobs with SLURM ------------------------------------ You can also submit Stata jobs to SLURM, the scheduler. A Stata script is called a "do-file," which contains a list of Stata commands that the interpreter will execute. You can write a do-file in any text editor, or in the Stata GUI's do-file editor: click "Do-File Editor"" in the "Window" menu. If your do-file is named "example.do," you can run it with either of the following commands: .. code:: bash stata < example.do stata -b do example.do Here is a very simple do-file, which computes a regression on the sample data set from above: .. literalinclude:: stata_example.do Here is a submission script that submits the Stata program to the default queue on Midway: .. literalinclude:: stata_example.sbatch :language: bash :download:`stata_example.do` is our example do-file, and :download:`stata_example.sbatch` is the submission script. To run this example, download both files to a directory on Midway. Enter the following command to submit the program to the scheduler: .. code:: bash sbatch stata_example.sbatch Output from this example can be found in the file named :file:`stata_example.log`, which will be created automatically in your current directory. Running Parallel Stata Jobs ___________________________ The parallel version of Stata, Stata/MP, can speed up computations and make effective use of RCC's resources. When running Stata/MP, you are limited to 16 cores and 5000 variables. Run an interactive Stata/MP session: .. code:: bash sinteractive module load stata stata-mp # or, for the graphical interface: xstata-mp Here is a sample do-file that would benefit from parallelization. It runs bootstrap estimation on another data set that ships with Stata. .. literalinclude:: stata_parallel.do Here is a submission script that will run the above do-file with Stata/MP: .. literalinclude:: stata_parallel.sbatch :language: bash Download :download:`stata_parallel.do` and :download:`stata_parallel.sbatch` to Midway, then run the program with: .. code:: bash sbatch stata_parallel.sbatch