In this exercise you will:
RiverProbAll
data.You should only attempt this (second) exercise using this dataset after completing the first exercise.)
data(RiverProbAll, package="ELMER")
str(RiverProbAll)
'data.frame': 91 obs. of 3 variables:
$ Flow : int 24 24 26 26 50 48 72 72 25 23 ...
$ Type : Factor w/ 3 levels "Canadian","Kayak",..: 1 3 2 1 1 1 1 2 1 2 ...
$ Problems: int 1 0 0 0 1 0 1 2 1 3 ...
You’ll also want the subset used previously.
data(RiverProb, package="ELMER")
str(RiverProb)
'data.frame': 21 obs. of 2 variables:
$ Flow : int 24 36 24 25 72 37 46 24 37 46 ...
$ Problems: int 0 0 0 0 0 0 0 0 0 0 ...
The dataset RiverProbAll
gives the number of problems
experienced by travelers on the Whanganui river. This is the full data
set including the craft type as a categorical variable.
In the previous exercise, you fitted a generalised linear model to this data using a Poisson distribution for the number of problems, using a log link function.
The example in Chapter 3 of ELMER used a subset of the Whanganui river data for Other craft. An identity link was used in that analysis instead of a log link.
Calculate the predicted number of problems for a range of river flows using this model, and compare with the predictions for the same flow and Other craft type for the model in the previous exercise .
Compare the models using a graph.
We need the model from the last exercise and the coefficients from the Chapter 3 example which used the identity link. It makes sense to just use the subset data given we are just comparing fitted values but we should make sure the coefficients are the same for the Other craft.
<- glm(Problems~Flow*Type, family=poisson(link="log"),data=RiverProbAll)
River.glm1 |> coef() River.glm1
(Intercept) Flow TypeKayak TypeOther Flow:TypeKayak
0.081954697 -0.012241681 0.016486317 3.788587331 0.003830624
Flow:TypeOther
-0.153219124
<- glm(Problems~Flow, family=poisson(link="log"), data=RiverProb)
River.glm1a |> coef() River.glm1a
(Intercept) Flow
3.8705420 -0.1654608
= coef(River.glm1a)[1]; Beta1 = coef(River.glm1a)[2] Beta0
Hint: the values of the final parameters after 8 iterations were 1.22329 and -0.01703 for the weighted model fitted in Chapter 3.
Create the graph, add the straight line for the model using the identity link (dark blue), and then the back-transformed line created on the log scale (green).
= RiverProb |> ggplot(aes(x=Flow, y=Problems)) + geom_point() +
Graph0 xlab("Flow") + ylab("Number of problems")
+
Graph0 geom_abline(intercept = 1.22329, slope = -0.01703, col="darkblue") +
geom_function(fun=function(x) {exp(Beta0 + Beta1*x)}, col="green")