Download the template R markdown file for this workshop.
In this exercise you will:
You need to have installed R, RStudio, and the necessary packages for
the course, including the ELMER package. See how to get set up for this course.
The quine data is in the MASS package. Look at its help page for a
description using ?quine.
data(quine, package="MASS")
str(quine)
'data.frame': 146 obs. of 5 variables:
$ Eth : Factor w/ 2 levels "A","N": 1 1 1 1 1 1 1 1 1 1 ...
$ Sex : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
$ Age : Factor w/ 4 levels "F0","F1","F2",..: 1 1 1 1 1 1 1 1 2 2 ...
$ Lrn : Factor w/ 2 levels "AL","SL": 2 2 2 1 1 1 1 1 2 2 ...
$ Days: int 2 11 14 5 5 13 20 22 6 6 ...
There are many different approaches for finding a transformation. One
major problem with the log transformation is that it cannot handle
response values of zero. A common tweak is to add a small increment to
the zero values in the data; another is to add a constant to all
response values. The logtrans() function in the MASS package will help
find a suitable \(\alpha\) for the expression \(y^\prime=\log(y+\alpha)\) for the
transformed response variable. We will investigate the benefits of this
transformation using the example for this function.
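Before running that example, it may help to see the zero problem directly. The following is a minimal sketch, not part of the original workshop material: the increment 0.5 and the constant 1 are arbitrary illustrative choices, and the object names (n_zero, days_tweak1, days_tweak2) are invented here.

```r
# Load the quine data without attaching all of MASS
data(quine, package = "MASS")

# How many response values are exactly zero?
n_zero <- sum(quine$Days == 0)
n_zero

# log(0) is -Inf, so the plain log of Days contains non-finite values
log_days <- log(quine$Days)
any(is.infinite(log_days))

# Tweak 1: add a small increment to the zero values only
# (0.5 is an arbitrary choice for illustration)
days_tweak1 <- ifelse(quine$Days == 0, 0.5, quine$Days)

# Tweak 2: add the same constant to every response value
# (alpha = 1 is an arbitrary choice for illustration)
alpha <- 1
days_tweak2 <- quine$Days + alpha

# Both tweaked responses have a finite log
all(is.finite(log(days_tweak1)))
all(is.finite(log(days_tweak2)))
```

The second tweak is the form \(\log(y+\alpha)\) that logtrans() works with; the function's job is to suggest a value of \(\alpha\) rather than leaving it as a guess.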
library(MASS)
example(logtrans)
Here is the text from the help page for your convenience…
logtrans(Days ~ Age*Sex*Eth*Lrn, data = quine,
alpha = seq(0.75, 6.5, len=20))
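The plot gives a visual answer, but the preferred \(\alpha\) can also be pulled out numerically. The sketch below assumes, based on the help page for logtrans(), that the returned list holds the \(\alpha\) values in x and the profile log-likelihood in y, and that plotit = FALSE suppresses the plot (and, by default, the interpolation). The names alpha_grid, prof and alpha_hat are invented for this illustration.

```r
library(MASS)

# Same grid of candidate alpha values as in the help page example
alpha_grid <- seq(0.75, 6.5, length.out = 20)

# Evaluate the profile log-likelihood over the grid without drawing the plot
prof <- logtrans(Days ~ Age*Sex*Eth*Lrn, data = quine,
                 alpha = alpha_grid, plotit = FALSE)

# Grid value of alpha with the largest profile log-likelihood
alpha_hat <- prof$x[which.max(prof$y)]
alpha_hat
```

Because the maximum is taken over a grid, alpha_hat is only as precise as the grid spacing; a finer grid around the peak will refine it if needed.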
Q: Confirm that this transformation is sensible under the Box-Cox paradigm. That is, make a transformed response variable (\(y+\alpha\)) using the preferred \(\alpha\) and check that the Box-Cox methodology does suggest the log transformation is appropriate.
Q: The model used in this example is multiplicative. Determine whether the suggested transformation is still appropriate if an additive model is used instead. That is, remove all interactions and check that the selection of \(\alpha\) remains appropriate.
You should compare your work with the solutions for this workshop.