A sociological experiment examined the way racial descent and gender influenced people’s helpfulness towards a stranger. The data, a \(2\times2\times2\times2\) array, is shown in the table below.
Rows give the requestor; columns give the respondent's gender and response.

| Requestor | Female: Help | Female: Refuse | Female: Total | Male: Help | Male: Refuse | Male: Total | All: Help | All: Refuse | All: Total |
|---|---|---|---|---|---|---|---|---|---|
| English females | 23 | 0 | 23 | 24 | 3 | 27 | 47 | 3 | 50 |
| English males | 20 | 4 | 24 | 21 | 5 | 26 | 41 | 9 | 50 |
| Asian females | 25 | 2 | 27 | 17 | 11 | 28 | 42 | 13 | 55 |
| Asian males | 9 | 15 | 24 | 21 | 5 | 26 | 30 | 20 | 50 |
Students of similar age and dressed alike approached strangers in a busy shopping precinct and requested change for a phone call. The students were either Asian or English, and either male or female. If the stranger provided or looked for change, the response was counted as helpful; not replying or not looking was counted as unhelpful. The stranger's gender was also noted. The data can be obtained using:
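The loading command itself is not reproduced here. As a stand-in, the following sketch rebuilds the data frame by hand from the table above. The column names (Count, QRace, QGender, AGender, AHelp) are taken from the glm() calls in the solution; the level names (Asian/English, female/male, Help/Refuse) are assumptions inferred from the coefficient labels in the output.

```r
# One row per cell of the 2x2x2x2 table. expand.grid() varies the first
# variable fastest, so the counts below are listed Help/Refuse within
# respondent gender, within requestor gender, within requestor race.
Helpful <- expand.grid(
  AHelp   = c("Help", "Refuse"),   # respondent's response (assumed level names)
  AGender = c("female", "male"),   # respondent's gender
  QGender = c("female", "male"),   # requestor's gender
  QRace   = c("English", "Asian"), # requestor's race
  stringsAsFactors = FALSE
)

Helpful$Count <- c(
  23,  0, 24,  3,   # English female requestors
  20,  4, 21,  5,   # English male requestors
  25,  2, 17, 11,   # Asian female requestors
   9, 15, 21,  5    # Asian male requestors
)

# factor() orders levels alphabetically, so the reference levels become
# Asian, female and Help, matching the coefficient names in the output.
for (v in c("QRace", "QGender", "AGender", "AHelp")) {
  Helpful[[v]] <- factor(Helpful[[v]])
}

# Quick check against the row totals of the table.
xtabs(Count ~ QRace + QGender, data = Helpful)
```

The cross-tabulation at the end should reproduce the row totals in the last column of the table, which is an easy way to catch a mistyped count.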
What are the explanatory and response variables?
What is the minimal model for a Poisson/log analysis?
Starting with the minimal model add interactions until the deviance drops to a value consistent with random variation. Give an interpretation of this model.
Does your model make sense when you look just at the proportions in the table? In other words, how well could you have predicted the model without formal analysis?
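Ethnicity and gender (of both the requestor and the respondent) are explanatory variables; only helpfulness is a response variable. In the glm() call all four are entered as terms on the right-hand side, with the cell count as the modelled quantity, but that is just the way a log-linear model is fitted.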
The starting point is the model with the four main effects.
Helpful.min <- glm(Count ~ QRace + QGender + AGender + AHelp, data=Helpful, family=poisson)
summary(Helpful.min)
Call:
glm(formula = Count ~ QRace + QGender + AGender + AHelp, family = poisson,
data = Helpful)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.99903 0.14446 20.761 < 2e-16 ***
QRaceEnglish -0.04879 0.13973 -0.349 0.727
QGendermale -0.04879 0.13973 -0.349 0.727
AGendermale 0.08786 0.13982 0.628 0.530
AHelpRefuse -1.26851 0.16874 -7.518 5.58e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 110.137 on 15 degrees of freedom
Residual deviance: 41.087 on 11 degrees of freedom
AIC: 114.15
Number of Fisher Scoring iterations: 5
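To judge whether a residual deviance is "consistent with random variation", it can be compared with a chi-squared distribution on the residual degrees of freedom. The sketch below does this for the minimal model, using the two numbers printed in the summary above; the variable names are illustrative only. With several small cell counts (including a zero) the chi-squared approximation is rough, so the result is a guide rather than an exact test.

```r
# Residual deviance and degrees of freedom copied from the summary output.
resid_dev <- 41.087
resid_df  <- 11

# Upper-tail probability: how likely is a deviance at least this large
# if the model were adequate?
p_value <- pchisq(resid_dev, df = resid_df, lower.tail = FALSE)
p_value
```

A very small value here indicates that the main-effects model leaves more variation unexplained than chance alone would suggest, which is the motivation for adding interactions.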
step() is an efficient way to find a decent model.
Helpful.step <- step(Helpful.min, scope = . ~ QRace * QGender * AGender * AHelp, test = "Chisq", direction = "forward")
Start: AIC=114.15
Count ~ QRace + QGender + AGender + AHelp
Df Deviance AIC LRT Pr(>Chi)
+ QRace:AHelp 1 29.415 104.48 11.6716 0.0006346 ***
+ QGender:AHelp 1 35.370 110.43 5.7169 0.0168019 *
<none> 41.087 114.15
+ QRace:QGender 1 40.971 116.03 0.1162 0.7331687
+ QRace:AGender 1 41.036 116.10 0.0507 0.8218613
+ AGender:AHelp 1 41.057 116.12 0.0300 0.8625987
+ QGender:AGender 1 41.084 116.15 0.0030 0.9564728
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Step: AIC=104.48
Count ~ QRace + QGender + AGender + AHelp + QRace:AHelp
Df Deviance AIC LRT Pr(>Chi)
+ QGender:AHelp 1 23.698 100.76 5.7169 0.0168 *
<none> 29.415 104.48
+ QRace:QGender 1 29.299 106.36 0.1162 0.7332
+ QRace:AGender 1 29.364 106.43 0.0507 0.8219
+ AGender:AHelp 1 29.385 106.45 0.0300 0.8626
+ QGender:AGender 1 29.412 106.48 0.0030 0.9565
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Step: AIC=100.76
Count ~ QRace + QGender + AGender + AHelp + QRace:AHelp + QGender:AHelp
Df Deviance AIC LRT Pr(>Chi)
<none> 23.698 100.76
+ QRace:QGender 1 22.817 101.88 0.88093 0.3479
+ QRace:AGender 1 23.648 102.71 0.05069 0.8219
+ AGender:AHelp 1 23.668 102.73 0.02995 0.8626
+ QGender:AGender 1 23.695 102.76 0.00298 0.9565
summary(Helpful.step)
Call:
glm(formula = Count ~ QRace + QGender + AGender + AHelp + QRace:AHelp +
QGender:AHelp, family = poisson, data = Helpful)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.95209 0.15557 18.976 < 2e-16 ***
QRaceEnglish 0.20067 0.15891 1.263 0.20666
QGendermale -0.22596 0.15912 -1.420 0.15561
AGendermale 0.08786 0.13982 0.628 0.52975
AHelpRefuse -1.22769 0.29909 -4.105 4.05e-05 ***
QRaceEnglish:AHelpRefuse -1.21227 0.37268 -3.253 0.00114 **
QGendermale:AHelpRefuse 0.82066 0.34972 2.347 0.01894 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 110.137 on 15 degrees of freedom
Residual deviance: 23.698 on 9 degrees of freedom
AIC: 100.76
Number of Fisher Scoring iterations: 5
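One way to interpret the selected model is to turn the two interaction coefficients into odds ratios for refusal. The sketch below simply exponentiates the estimates printed above; the vector names are illustrative, and the reference levels (Asian requestor, female requestor) are read off the coefficient labels.

```r
# Interaction estimates copied from the summary of the selected model.
coefs <- c(
  "QRaceEnglish:AHelpRefuse" = -1.21227,
  "QGendermale:AHelpRefuse"  =  0.82066
)

# On the log scale an interaction with AHelpRefuse is a log odds ratio
# for refusing, relative to the reference level of the other factor.
odds_ratios <- exp(coefs)
round(odds_ratios, 2)
```

Read this way, the odds of a refusal are about 0.30 times as large for an English requestor as for an Asian one, and about 2.27 times as large for a male requestor as for a female one. The respondent's gender (AGender) does not appear in any interaction with AHelp, so the selected model contains no effect of the respondent's gender on helpfulness.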
anova(Helpful.step, test = "Chisq")
Analysis of Deviance Table
Model: poisson, link: log
Response: Count
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL 15 110.137
QRace 1 0.122 14 110.015 0.7269148
QGender 1 0.122 13 109.894 0.7269148
AGender 1 0.395 12 109.498 0.5295531
AHelp 1 68.411 11 41.087 < 2.2e-16 ***
QRace:AHelp 1 11.672 10 29.415 0.0006346 ***
QGender:AHelp 1 5.717 9 23.698 0.0168019 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
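For the last question, the model can be checked against the raw proportions. The sketch below works only from the "All respondents" totals in the data table; the data frame and column names are illustrative and not part of the original solution.

```r
# Help/Refuse totals over all respondents, one row per type of requestor.
totals <- data.frame(
  QRace   = c("English", "English", "Asian", "Asian"),
  QGender = c("female", "male", "female", "male"),
  Help    = c(47, 41, 42, 30),
  Refuse  = c(3, 9, 13, 20)
)

# Proportion of refusals for each type of requestor.
totals$PropRefuse <- totals$Refuse / (totals$Help + totals$Refuse)
totals

# Odds of refusal by requestor race, and the English/Asian odds ratio.
refuse_by_race <- tapply(totals$Refuse, totals$QRace, sum)
help_by_race   <- tapply(totals$Help, totals$QRace, sum)
odds_race      <- refuse_by_race / help_by_race
or_race        <- odds_race[["English"]] / odds_race[["Asian"]]

# Same calculation by requestor gender (male relative to female).
refuse_by_gender <- tapply(totals$Refuse, totals$QGender, sum)
help_by_gender   <- tapply(totals$Help, totals$QGender, sum)
odds_gender      <- refuse_by_gender / help_by_gender
or_gender        <- odds_gender[["male"]] / odds_gender[["female"]]

c(race = or_race, gender = or_gender)
```

The refusal proportions are lowest for English female requestors and highest for Asian male requestors, and the two odds ratios agree with the exponentiated interaction coefficients of the selected model. The two requestor effects are therefore visible in the table without any formal analysis.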