In this exercise you will:
You need to have installed R, RStudio, and the necessary packages for
the course, including the ELMER
package. See how to
get set up for this course
data(Leukemia, package="ELMER")
str(Leukemia)
'data.frame': 17 obs. of 2 variables:
$ Time : int 65 156 100 134 16 108 121 4 39 143 ...
$ Count: num 3.36 2.88 3.63 3.41 3.78 4.02 4 4.23 3.73 3.85 ...
Re-create the figure and table given in the Leukemia data example in Chapter 3 of ELMER, which show the fitted values for the unweighted and weighted models.
<- lm(Time~Count, data=Leukemia)
Model <-Model
ModelUfor(i in 1:21){
<- (replace(predict(Model),predict(Model)<=0,0.001))^2 # variance = mean^2
Variance <- update(Model, weights=1/Variance)
Model
}<-Model ModelW
Note that the predicted values are squared to give the variances, which then get used as the denominator of the weights. Also note that a lot of iterations are needed to reach convergence. I used more than 20 to get the weighted model.
|> mutate(PredictionsU=predict(ModelU), PredictionsW=predict(ModelW)) |> kable(caption="Original data for the Leukemia patients as well as fitted values for unweighted and weighted models.") Leukemia
Time | Count | PredictionsU | PredictionsW |
---|---|---|---|
65 | 3.36 | 106.024326 | 93.33218 |
156 | 2.88 | 134.639342 | 115.29723 |
100 | 3.63 | 89.928379 | 80.97684 |
134 | 3.41 | 103.043595 | 91.04416 |
16 | 3.78 | 80.986187 | 74.11276 |
108 | 4.02 | 66.678679 | 63.13024 |
121 | 4.00 | 67.870971 | 64.04545 |
4 | 4.23 | 54.159609 | 53.52053 |
39 | 3.73 | 83.966918 | 76.40079 |
143 | 3.85 | 76.813164 | 70.90953 |
56 | 3.97 | 69.659410 | 65.41826 |
26 | 4.51 | 37.467517 | 40.70758 |
22 | 4.45 | 41.044394 | 43.45321 |
1 | 5.00 | 8.256354 | 18.28493 |
1 | 5.00 | 8.256354 | 18.28493 |
5 | 4.72 | 24.948447 | 31.09787 |
65 | 5.00 | 8.256354 | 18.28493 |
ggplot(data=Leukemia, aes(y=Time, x=Count)) +xlab(expression(paste(log[10]," (Blood Count)",sep=""))) + ylab("Survival (weeks)")+ geom_point() +
geom_abline(intercept=coef(ModelU)[1], slope=coef(ModelU)[2], lty=2)+
geom_abline(intercept=coef(ModelW)[1], slope=coef(ModelW)[2], col=2)
Voila!