Practical Computing Exercise for Week 2: Reworking of the Tree Diameters Example (Solutions)

Aims of this practical exercise

In this exercise you will:

  • rework one of the examples given in ELMER
  • get some practice using R Markdown

Before you undertake this exercise…

You need to have installed R, RStudio, and the necessary packages for the course, including the ELMER package. See how to get set up for this course.
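
If you have not already done so, load the packages used in these solutions. This is a minimal sketch: it assumes tidyverse (for ggplot2 and dplyr) and knitr (for kable()) are already installed, and that the ELMER package has been installed as described in the set-up instructions.

library(tidyverse)   # ggplot2 for the plots; dplyr/tibble for filter(), mutate() and glimpse()
library(knitr)       # kable() for the tables
# The TreeDiams data are loaded directly from the installed ELMER package below.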

data(TreeDiams, package="ELMER")
str(TreeDiams)
'data.frame':   12 obs. of  2 variables:
 $ Diameter: num  0.9 1.2 2.9 3.1 3.3 3.9 4.3 6.2 9.6 12.6 ...
 $ Height  : num  18 26 32 36 44.5 35.6 40.5 57.5 67.3 84 ...

Fit the models

Fit the models used in the example in Chapter 1 of ELMER.

TreeDiams.lm1 = lm(Height~Diameter, data=TreeDiams)
TreeDiams.lm2 = lm(Height~log(Diameter), data=TreeDiams)
TreeDiams.lm3 = lm(log(Height)~log(Diameter), data=TreeDiams)
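
If you want to check any of the fits before plotting, summary() shows the coefficients and R-squared; for example:

summary(TreeDiams.lm3)   # coefficients of the log(Height) ~ log(Diameter) fit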

Make the graph

Create only the graph that shows the fitted models on the original data scale, because this is the one that matters.

N.B. You should try to do this using ggplot() if you can. To get the curves of your models on your scatter plot, you might make use of geom_function().

TreeDiams |> ggplot(aes(y=Height, x=Diameter)) + geom_point() +
    geom_smooth(method="lm", se=FALSE) +
    geom_function(fun = function(x) coef(TreeDiams.lm2)[1] + coef(TreeDiams.lm2)[2]*log(x), lty=2) +
    geom_function(fun = function(x) exp(coef(TreeDiams.lm3)[1] + coef(TreeDiams.lm3)[2]*log(x)), lty=3)
`geom_smooth()` using formula 'y ~ x'
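
geom_function() is not the only way to add the curves. If you prefer, you can compute predictions over a grid of diameters with predict() and add them with geom_line(); the sketch below uses a grid object DiamGrid that is introduced here only for illustration.

DiamGrid <- data.frame(Diameter=seq(min(TreeDiams$Diameter), max(TreeDiams$Diameter), length.out=100))
DiamGrid$Fit2 <- predict(TreeDiams.lm2, DiamGrid)        # Height ~ log(Diameter), already on the original scale
DiamGrid$Fit3 <- exp(predict(TreeDiams.lm3, DiamGrid))   # back-transform the log(Height) predictions
TreeDiams |> ggplot(aes(y=Height, x=Diameter)) + geom_point() +
    geom_smooth(method="lm", se=FALSE) +
    geom_line(data=DiamGrid, aes(y=Fit2), lty=2) +
    geom_line(data=DiamGrid, aes(y=Fit3), lty=3)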

Make the tables of fitted values

Use predict() to find the predicted heights at diameters of 5, 10 and 25 inches from the three models used so far. Use kable() to put them into a nice table.

PredData <- data.frame(Diameter=c(5, 10, 25))
Fits1 <- predict(TreeDiams.lm1, PredData, se.fit=TRUE)
Fits2 <- predict(TreeDiams.lm2, PredData, se.fit=TRUE)
Fits3 <- predict(TreeDiams.lm3, PredData, se.fit=TRUE)
TreeDiamsTable <- cbind(Fits1$fit, Fits1$fit-Fits1$se.fit, Fits1$fit+Fits1$se.fit,
                        Fits2$fit, Fits2$fit-Fits2$se.fit, Fits2$fit+Fits2$se.fit,
                        exp(Fits3$fit), exp(Fits3$fit-Fits3$se.fit), exp(Fits3$fit+Fits3$se.fit))
rownames(TreeDiamsTable) <- c("5 inch", "10 inch", "25 inch")
colnames(TreeDiamsTable) <- rep(c("Mean", "-SE", "+SE"), 3)
TreeDiamsTable |> kable()
              Mean       -SE        +SE      Mean       -SE       +SE      Mean       -SE        +SE
5 inch    42.89175  39.55850   46.22499  50.33222  48.20356  52.46088  45.49154  43.75472   47.29731
10 inch   56.47018  53.13448   59.80588  65.18187  62.51958  67.84416  62.77474  59.79172   65.90658
25 inch   97.20547  88.83945  105.57149  84.81204  80.60945  89.01463  96.08644  88.97853  103.76216
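
The table can be tidied a little if you wish: kable() has a digits argument for rounding and a caption argument you could use to say which columns belong to which model (if you are knitting to HTML, kableExtra::add_header_above() is another way to label the column groups). A minimal sketch:

TreeDiamsTable |> kable(digits=2, caption="Predicted heights with one-standard-error limits for Models 1, 2 and 3")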

Additional exercises

Calculate regressions of Height on Diameter and of Height on log(Diameter), omitting the largest tree in the TreeDiams data.

TreeDiams2 <- TreeDiams |>
    filter(Height < max(Height)) |>            # drop the largest tree (the one with the maximum Height)
    mutate(LogDiameter = log(Diameter)) |>     # add log(Diameter) as a column for the second model
    glimpse()
Rows: 11
Columns: 3
$ Diameter    <dbl> 0.9, 1.2, 2.9, 3.1, 3.3, 3.9, 4.3, 6.2, 9.6, 12.6, 16.1
$ Height      <dbl> 18.0, 26.0, 32.0, 36.0, 44.5, 35.6, 40.5, 57.5, 67.3, 84.0…
$ LogDiameter <dbl> -0.1053605, 0.1823216, 1.0647107, 1.1314021, 1.1939225, 1.…
TreeDiams2.lm1 = lm(Height~Diameter, data=TreeDiams2)
TreeDiams2.lm2 = lm(Height~LogDiameter, data=TreeDiams2)

Replot the scatter plot using the reduced data, with the fitted lines added.

TreeDiams2 |> ggplot(aes(y=Height, x=Diameter)) + geom_point() +
    geom_smooth(method="lm") +
    geom_function(fun = function(x) coef(TreeDiams2.lm2)[1] + coef(TreeDiams2.lm2)[2]*log(x), lty=2)
`geom_smooth()` using formula 'y ~ x'

Which of the two regressions would you choose, and why?
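
To inform your answer, you could compare the two fits numerically and graphically; one possible sketch, using base-R summaries and residual plots (not necessarily the comparison the course intends), is:

summary(TreeDiams2.lm1)$r.squared   # R-squared for Height ~ Diameter
summary(TreeDiams2.lm2)$r.squared   # R-squared for Height ~ LogDiameter
par(mfrow=c(1, 2))
plot(TreeDiams2.lm1, which=1)       # residuals vs fitted values, linear model
plot(TreeDiams2.lm2, which=1)       # residuals vs fitted values, log-diameter model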