Introduction

The 161.331 Biostatistics course makes use of statistical software for both learning and assessment. As you might have guessed, we use R and/or RStudio; as it happens, this is the case for most undergraduate Statistics courses at Massey. In some courses, the use of R has affected the way we have designed the course work, activities, or assessment.

Please read this document all the way through to understand what is expected of you. If you have trouble, please contact Jonathan Godfrey or post to the 161.331 Stream site.

It is highly preferable that you get the installation of the software needed for 161.331 sorted out before the semester starts.

R

R is freely available and runs on Windows, Mac, and Linux operating systems. There are also web-based methods for running R, but we recommend that you download the correct version for your operating system at the beginning of the semester. Please download and install the latest version, especially if you last used R more than six months ago. Using a much older version of R creates headaches for staff and students alike.

While R is updated fairly often, it is very unusual for the minor changes from one version to the next to have a material impact on your learning experience. We advise sticking to the version of R you install at the start of semester. This document was written to help you get the installation right in such a way that your needs throughout the semester should be covered.

We will assume you install version 4.4.1 of R. It is available from CRAN, the Comprehensive R Archive Network, but please look at the specific instructions for your computer’s operating system.
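Once R is installed, you can confirm which version you are running from the R console. The lines below are a minimal sketch; the name assumed_version is only an illustration and is set to the version assumed in this document.

```r
# Print the full version string of the running R session
R.version.string

# Compare the running version against the version assumed in this document
assumed_version <- "4.4.1"
version_ok <- getRversion() >= assumed_version
version_ok
```

If version_ok is FALSE, your installed R is older than the version this document assumes.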

Windows users

You can use a direct link to the latest installer for R under Windows, which will download an installation file. Depending on how you have your browser configured, you may be asked if you want to run the installer or download it first. We recommend downloading the file and then running it after it has finished downloading.

Mac users

The Mac version of R requires macOS 10.13 High Sierra or later. If your machine uses an older version of macOS, then you will experience difficulty. If you cannot use a recent version of macOS, or another computer, then please contact the course staff immediately. The Mac version of R can be downloaded from the macOS page on CRAN. There are versions of R available for older macOS releases on the same page, but using one may mean that you have more problems installing the packages that will be needed. It is preferable to be running the latest R version if possible.

When you download a .pkg file, you’ll need to open the .pkg and then run the installer.

Linux users

Instructions are available for most major Linux distributions, including Ubuntu. Once repositories are configured, you will get updates to R and its base packages automatically.

RStudio

In addition to R, which does all the work, we strongly advise our students to make use of another application called RStudio. Whether you like to think of RStudio as a “front end” or “integrated development environment”, it is designed to make common tasks easier to manage than using R alone. While RStudio is a commercial operation, the software is made freely available to all users.

A download for RStudio is available from the RStudio download page, where you want the “RStudio Desktop FREE” version. Download the installer for the operating system you are running.

Windows users

Download and execute the single file for the latest version.

Mac users

The Mac version comes as a disk image (.dmg) file. You should open the downloaded file, and then drag the RStudio icon into the Applications folder to install. Once done, the .dmg file is no longer required.

Linux users

Linux versions typically come as a package. You’d use your package manager (e.g. apt/dpkg or yum) to install.

Additional packages

Much of the extensive power of R is delivered to end users by way of additional packages. Your standard installation delivers around twenty packages, but the development community has contributed more than twenty thousand additional packages. Obviously we don’t want to use up your valuable time searching through all of those packages, so we have compiled the list of packages you will need to download once you have installed R and/or RStudio.

We recommend that all students install the rmarkdown and tidyverse packages (instructions below).

Packages you need to install (in addition to the rmarkdown and tidyverse packages) for 161.331: ELMER, GLMsData, lme4, rsm, rstatix

The ELMER package is not available through CRAN. You must download it from the class Stream site and install it from the downloaded file.
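Installing from a downloaded file can be done from the R console. The lines below are only a sketch: the file name and folder are hypothetical, so replace them with the actual name and location of the file you downloaded from Stream.

```r
# Hypothetical path: replace with wherever you saved the file from Stream
elmer_file <- path.expand("~/Downloads/ELMER.tar.gz")

# repos = NULL tells R to install from the local file rather than from CRAN
if (file.exists(elmer_file)) {
  install.packages(elmer_file, repos = NULL)
} else {
  message("File not found: ", elmer_file)
}
```

When a package is installed from a local file like this, R does not fetch any other packages it depends on. If the installation reports that a package is missing, install that package from CRAN first and then try again.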

Many packages use functionality found in other packages. All those necessary packages will also be installed when they are needed. This means that you might see some packages being installed that you didn’t explicitly ask for. Don’t panic if this happens. A key example of this is the tidyverse package which doesn’t actually have any functionality. It just makes sure that a set of other packages are all installed. In the end, installing the tidyverse package is just a really useful timesaver.
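The CRAN packages named above can be installed in one step from the R console. This is a minimal sketch rather than the required procedure: the names course_pkgs and missing_pkgs are illustrative, the mirror address is just one commonly used CRAN mirror, and ELMER is left out because it does not come from CRAN.

```r
# CRAN packages named in this document (ELMER is handled separately)
course_pkgs <- c("rmarkdown", "tidyverse", "GLMsData", "lme4", "rsm", "rstatix")

# Find which of them are not yet installed in any library on this machine
missing_pkgs <- setdiff(course_pkgs, rownames(installed.packages()))

# Install only the missing ones; nothing happens if everything is present.
# Any packages they depend on are installed automatically.
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}
```

Running these lines a second time is harmless, because packages that are already installed are skipped.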

Note for Windows and Mac users

Sometimes the installation process may tell you that a “source package” is available in a later version than the binary package, and ask you which you prefer. You should generally choose the older binary package rather than the newer source package. If you are using commands, then this issue can be avoided by altering the installation commands as follows:

chooseCRANmirror(ind=1)
install.packages("tidyverse", dependencies=TRUE, type="binary")

(i.e. adding type="binary" to show that you prefer that version).

Further troubleshooting

The default options are usually the best options. Look through the following commonly asked questions to see if we’ve already addressed your question, but if all else fails, get in touch with the course staff before pain persists.

What is a “personal library”?

The standard installation of R gets put on hard drive space that is restricted to system administrators. If you are not a system administrator, then R/RStudio will ask if you would like to use a personal library instead. This just means that the additional packages get put under your user name instead of the system administrator name on the hard drive.
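If you are curious about where your packages are being stored, R can tell you. This short sketch only reports locations and changes nothing; the names lib_paths and user_lib are illustrative.

```r
# Folders R searches for packages; new packages go into the first one by default
lib_paths <- .libPaths()
lib_paths

# The location R uses for a personal library on this machine
user_lib <- Sys.getenv("R_LIBS_USER")
user_lib
```

If the first folder listed sits under your own user name, then you are using a personal library.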

Why does RStudio ask to install more packages?

Recent versions of RStudio try to help you out by warning you when a package needed for what you are doing is not installed. Install any needed packages if you see this warning message.
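You can also check for yourself whether particular packages are installed and able to be loaded. The sketch below uses the two packages recommended for all students as an example; the names needed and installed are illustrative.

```r
# Packages to check; edit this list to suit
needed <- c("rmarkdown", "tidyverse")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
installed
```

Any package showing FALSE still needs to be installed.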