In this exercise you will:

- fit the regression models used in the example in Chapter 1 of ELMER to the `TreeDiams` data;
- plot the fitted models on the original data scale;
- use `predict()` to tabulate fitted values for new diameters;
- refit the models after omitting the largest tree and decide which fit you prefer.
You need to have installed R, RStudio, and the necessary packages for the course, including the ELMER package. See how to get set up for this course.
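The code below also uses functions from a few packages besides ELMER. As a minimal sketch (the exact package list is my assumption; the course setup page is authoritative), attaching these first should cover everything in this exercise:

```r
library(ELMER)      # course package; supplies the TreeDiams data used below
library(tidyverse)  # ggplot2 (ggplot, geom_*), dplyr (filter, mutate, glimpse)
library(knitr)      # kable() for formatted tables
```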
```r
data(TreeDiams, package="ELMER")
str(TreeDiams)
```

```
'data.frame': 12 obs. of 2 variables:
 $ Diameter: num 0.9 1.2 2.9 3.1 3.3 3.9 4.3 6.2 9.6 12.6 ...
 $ Height  : num 18 26 32 36 44.5 35.6 40.5 57.5 67.3 84 ...
```
Fit the models used in the example in Chapter 1 of ELMER.
```r
TreeDiams.lm1 = lm(Height~Diameter, data=TreeDiams)
TreeDiams.lm2 = lm(Height~log(Diameter), data=TreeDiams)
TreeDiams.lm3 = lm(log(Height)~log(Diameter), data=TreeDiams)
```
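The exercise itself doesn't ask for it, but if you want to look at the estimated intercepts and slopes before plotting, a quick sketch:

```r
# Estimated coefficients and their standard errors for each model
summary(TreeDiams.lm1)$coefficients
summary(TreeDiams.lm2)$coefficients
summary(TreeDiams.lm3)$coefficients
```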
Create just the graph that shows the fitted models on the original data scale, since this is the one that matters.
N.B. You should try to do this using `ggplot()` if you can. To get the curves of your models on your scatter plot, you might make use of `geom_function()`.
```r
TreeDiams |> ggplot(aes(y=Height, x=Diameter)) +
  geom_point() +
  geom_smooth(method="lm", se=FALSE) +
  geom_function(fun = function(x) coef(TreeDiams.lm2)[1] + coef(TreeDiams.lm2)[2]*log(x), lty=2) +
  geom_function(fun = function(x) exp(coef(TreeDiams.lm3)[1] + coef(TreeDiams.lm3)[2]*log(x)), lty=3)
```

```
`geom_smooth()` using formula 'y ~ x'
```
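If `geom_function()` feels awkward, an equivalent sketch is to evaluate each fitted model on a grid of diameters with `predict()` and draw the curves with `geom_line()`. The `grid` data frame and its column names here are just illustrative:

```r
# Grid of diameters spanning the observed range
grid <- data.frame(Diameter = seq(min(TreeDiams$Diameter),
                                  max(TreeDiams$Diameter), length.out = 200))
grid$fit2 <- predict(TreeDiams.lm2, grid)       # Height ~ log(Diameter)
grid$fit3 <- exp(predict(TreeDiams.lm3, grid))  # back-transform log(Height) to Height

ggplot(TreeDiams, aes(x = Diameter, y = Height)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +      # straight-line fit (TreeDiams.lm1)
  geom_line(data = grid, aes(y = fit2), lty = 2) +
  geom_line(data = grid, aes(y = fit3), lty = 3)
```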
Use `predict()` to find the fitted values from the three models used so far. Use `kable()` to put them into a nice table.
```r
PredData <- data.frame(Diameter=c(5, 10, 25))
Fits1 <- predict(TreeDiams.lm1, PredData, se.fit=TRUE)
Fits2 <- predict(TreeDiams.lm2, PredData, se.fit=TRUE)
Fits3 <- predict(TreeDiams.lm3, PredData, se.fit=TRUE)
TreeDiamsTable <- cbind(Fits1$fit, Fits1$fit-Fits1$se.fit, Fits1$fit+Fits1$se.fit,
                        Fits2$fit, Fits2$fit-Fits2$se.fit, Fits2$fit+Fits2$se.fit,
                        exp(Fits3$fit), exp(Fits3$fit-Fits3$se.fit), exp(Fits3$fit+Fits3$se.fit))
rownames(TreeDiamsTable) <- c("5 inch", "10 inch", "25 inch")
colnames(TreeDiamsTable) <- rep(c("Mean", "-SE", "+SE"), 3)
TreeDiamsTable |> kable()
```
The first three columns come from `TreeDiams.lm1`, the middle three from `TreeDiams.lm2`, and the last three from `TreeDiams.lm3` (back-transformed to the original height scale); each group shows the fitted mean and the mean plus or minus one standard error.

|         | Mean | -SE | +SE | Mean | -SE | +SE | Mean | -SE | +SE |
|---------|------|-----|-----|------|-----|-----|------|-----|-----|
| 5 inch  | 42.89175 | 39.55850 | 46.22499 | 50.33222 | 48.20356 | 52.46088 | 45.49154 | 43.75472 | 47.29731 |
| 10 inch | 56.47018 | 53.13448 | 59.80588 | 65.18187 | 62.51958 | 67.84416 | 62.77474 | 59.79172 | 65.90658 |
| 25 inch | 97.20547 | 88.83945 | 105.57149 | 84.81204 | 80.60945 | 89.01463 | 96.08644 | 88.97853 | 103.76216 |
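A side note, going beyond what the exercise asks: instead of building the ±1 SE limits by hand, `predict()` can return 95% confidence intervals directly via `interval = "confidence"`. A sketch for the first and third models:

```r
CI1 <- predict(TreeDiams.lm1, PredData, interval = "confidence")
CI3 <- exp(predict(TreeDiams.lm3, PredData, interval = "confidence"))  # back to height scale
cbind(CI1, CI3) |> kable(digits = 2)
```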
Calculate a regression of height on diameter and height on `log(Diameter)`, omitting the largest tree in the `TreeDiams` data.
```r
TreeDiams2 <- TreeDiams |> filter(Height<max(Height)) |> mutate(LogDiameter=log(Diameter)) |> glimpse()
```

```
Rows: 11
Columns: 3
$ Diameter    <dbl> 0.9, 1.2, 2.9, 3.1, 3.3, 3.9, 4.3, 6.2, 9.6, 12.6, 16.1
$ Height      <dbl> 18.0, 26.0, 32.0, 36.0, 44.5, 35.6, 40.5, 57.5, 67.3, 84.0…
$ LogDiameter <dbl> -0.1053605, 0.1823216, 1.0647107, 1.1314021, 1.1939225, 1.…
```
```r
TreeDiams2.lm1 = lm(Height~Diameter, data=TreeDiams2)
TreeDiams2.lm2 = lm(Height~LogDiameter, data=TreeDiams2)
```
Replot the scatter plot using the reduced data, with the fitted lines added.
```r
TreeDiams2 |> ggplot(aes(y=Height, x=Diameter)) +
  geom_point() +
  geom_smooth(method="lm") +
  geom_function(fun = function(x) coef(TreeDiams2.lm2)[1] + coef(TreeDiams2.lm2)[2]*log(x), lty=2)
```

```
`geom_smooth()` using formula 'y ~ x'
```
Which of the two regressions would you choose, and why?
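One way to ground your answer (a sketch of some checks you might use, not the only defensible approach) is to compare the two fits numerically and look at their residuals:

```r
# R-squared and AIC for the two reduced-data models
summary(TreeDiams2.lm1)$r.squared
summary(TreeDiams2.lm2)$r.squared
AIC(TreeDiams2.lm1, TreeDiams2.lm2)

# Residuals-vs-fitted plots: look for curvature or uneven spread
par(mfrow = c(1, 2))
plot(TreeDiams2.lm1, which = 1)
plot(TreeDiams2.lm2, which = 1)
```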