10 LURN... To Become More Efficient in your Workplace

[Version for blind users] [next] [prev] [prev-tail] [tail] [up]

Chapter 10
LURN… To Become More Efficient in your Workplace

This chapter is aimed at the efficient use of R. To make use of the functionality described here you must have read the material on storing your results in Chapter 9.

10.1 Managing R when you have multiple projects on the go

One of the greatest frustrations for novice users when using software like R is the management of the various datasets and projects they have. The temptation to give objects short names that mean something now and are utterly mystifying in six months time (perhaps when you go to do a clean out) is a common trap. If it is something temporary, give it a useless uninformative name; if it’s actually going to be important, give it a useful informative name when you first create it.

It’s also common to just start working in the main working directory and just keep on adding various data sets and analysis objects until such time as the working directory has hundreds of objects and becomes unwieldy. If you plan to use R for one or two analyses and then move on this might work, but this is not good practice.

My recommendation is to use different folders for the different projects you wish to work on. I make folders and subfolders for my R work in the same way I would for my word processed documents. By changing the working directory, we end up with a command history and stored workspace specific to the one project. It also means your graphs are saved here and are then easier to find. It might even prove useful to keep the R work and the associated word processed documents in the same folder.

If you change working directory, either using the menus or the setwd() command and save the workspace when quitting R, you will see that files called .RData and .Rhistory have been created. When you open R and change to this working directory, you will have access to the relevant data and command history.

The following is useful as a Windows user of R. As an additional step towards efficiency, I copy the shortcut used to start R into each working directory. It will need slight editing to make sure the shortcut doesn’t quite take the default action by deleting the contents of the “Start in" dialogue box for the shortcut’s properties. In this way, I can use Windows Explorer to move to the folder for a certain project and click the shortcut for R to get started on a project at the point where I left off.

10.2 Batch processing commands

When your R work gets too many commands that must be processed in order, it’s a good idea to use the R script window or a text editor to create a file with the commands in it. Keeping your commands in a text file means you can share them with colleagues very easily.

If you choose to use the R script window, then the commands are typed up and are not executed until you specifically request for them to be executed. This is achieved using the pull down menus.

If you have saved the commands in a text file, and this text file is placed in your working directory, you will be able to execute the commands using the source() command.

As a simple experiment, type up some commands in a new script window. Execute them using the pull down menus. Now save the commands in a text file, and go to the R console and type source("myfile.R") — note that I used the extension “R" in the filename here. This is common practice among R users.

If you’re not using the main Windows version of R, you will be entering commands in a terminal window. In this case, it’s probably easiest to use your favourite text editor software to type up and save your R commands. It doesn’t matter which version of R is being used, the source() command will behave consistently.

10.3 Running an R script without opening R under Windows

Running your R programs on another computer is a great idea because it means you can get on with another project while a long program is running. You’ll probably need some support from your local network administrator to achieve that, but in the meantime you can see how it’s done by running some jobs in the background on your own machine.

In the previous section I discussed the ability to use the source() command to import a text file of commands into an R session.

To continue with the rest of this section we need to know where to find R on our local machine. On my computer this was

C:\Program Files\R\R-3.5.0

because I installed this version of R using the default settings. This folder is often called R_HOME. If you haven’t installed R using the default settings, you will need to find out where your R_HOME is and remember it in the instructions that follow.

What we actually want is to find a file called rterm.exe which is in the bin subfolder. The full filename including its path is going to be

C:\Program Files\R\R-3.5.0\bin\Rterm.exe

Note that this is the file that will run R in a terminal window, not rgui.exe which is for the standard R console. Note also that as from version 2.12.0, R running under Windows had an extra element in the folder structure which depends on the machine you are using. Many Windows users are now moving to 64 bit operating systems and software, while others still use the older 32 bit software. Check your path carefully.

We now need to create a batch file. This is just a text file with the extension bat instead of the normal txt. If you open a text file it is readable in a text editor like notepad; a batch file is a set of commands that are executed when you run the file. On Windows, this is as simple as pressing <Enter> when the filename is highlighted in the Explorer window, or double clicking as if you were opening the file.

So let’s start by creating a text file in the chosen working directory. Do this using the menus in Windows Explorer. Open the file so that we are able to add the following text.

"C:\Program Files\R\R-3.5.0\bin\Rterm.exe" < mycommands.R > myoutput.txt

This single command has three parts; the first part in quotes is the full path to the program we want to run, and the quotes are essential. The second is the set of commands we want executed in that program, and the last is the file where we want the results to be saved. The symbols between the three parts are essential and the direction is important. This process is often called “piping". The contents of the commands file are piped into R, and the results are piped out to another file.

Save the text file and change its filename so that the extension is bat. Run the file. If the full path to the rterm.exe was correct for your machine, and the file of commands mycommands.R exists in the current directory, you should see that a new text file called myoutput.txt has been created. You can open this file to see what happened.

A word of warning

This approach works well for programs that have no bugs. Finding where a bug exists is not impossible but can be frustrating if there are too many bugs to fix. The program will stop if a line of code contains an error of any kind. Unfortunately, experimentation is probably the only way to know if things will work properly.

You could re-run your programs frequently in order to ensure that you are getting what you want. If it proves slow and frustrating to do this, it’s probably because you have a large data set or some computationally intensive elements in your program. If this is the case, you could use a subset of your data while you test your programs.

10.4 Using file associations in Windows

When I want to edit an R script file, I want it to open in my text editor of choice. Under Windows operating systems, this can be achieved by ensuring a file association is created. The simplest way to do this is to select the default program to be use when you want to open a file with the extension “.R".

If you do this, then the next logical thing to want to do is to have that R script processed using R itself. You could use the approach given earlier in this chapter but an even better way exists that uses file associations again.

We need to make a batch file again, just like in the example on piping. This batch file will have just one line in it and it needs to be put in a folder that is on the system path. The single line is

"C:\Program Files\R\R-3.5.0\bin\R.exe" CMD BATCH --vanilla --quiet %1

Now look for an R script and select the default program to open it again. This time, browse for the batch file you just created. Now every time you click on an R script, it will be processed by R. All output including commands will be put into a file of the same name with an extension that has the three extra letters “out" on the end. Use the process for file associations again to get this file opened automatically; use the viewer of your choice, whether that be a text editor or a browser.

The use of the two arguments vanilla and quiet ensure you have a fresh workspace and that the normal announcements of version number etc are not included in the output file. You might want to include these welcome messages so remove the quiet argument if you want. Another argument that might be useful is the save argument. Add that in if you want to be able to open R again in interactive mode later and be able to continue from where you left off.

You might want to go back now and re-assign the default program for an R script to be the text editor, and use the “open with" menu item to have it processed using your batch process.

10.5 Don’t re-invent any wheels

Sometimes it is obvious that you will need to write out some quite detailed and complex sets of commands, but when you can’t find the functionality within the base distribution of R, you might be tempted to write the code yourself. Before you do that though, I suggest you stop and think if the task is something that someone else is likely to have done before.

If the answer is yes, then you should look at the vast list of contributed packages available via the Comprehensive R Archive Network (CRAN) accessible under

http://CRAN.R-_project.org

be warned though that this is not necessarily a successful way of finding the right code. I prefer to do an internet search and hope the supporting documentation for any contributed packages has the same key words as you are thinking about. The data for many textbooks is also available as contributed packages so don’t be tempted to enter data unnecessarily.

[Version for blind users] [next] [prev] [prev-tail] [front] [up]