Practical Computing Exercise for Week 8 :Fairness in the use of the Death Penalty Solutions

Aims of this practical exercise

In this exercise you will:

  • fit a log-linear model.
  • compare the outcome for a log-linear model with that of a logistic regression.

The exercise

Agresti (2007) presented an example on the counts of murderers given the death penalty in the state of Florida. The data can be used to judge the level of racial prejudice that might exist in the justice system as the race of the murderer and their victims are noted. Use a log-linear model to determine the appropriate model for this scenario and draw a conclusion on the possible prejudice that may exist. Do your results match those that could be found using a logistic regression approach?

The data can be obtained using:

    data(DeathPenalty, package="ELMER") 
str(DeathPenalty)
'data.frame':   4 obs. of  4 variables:
 $ Victim   : Factor w/ 2 levels "Black","White": 2 2 1 1
 $ Defendant: Factor w/ 2 levels "Black","White": 2 1 2 1
 $ Yes      : int  53 11 0 4
 $ No       : int  414 37 16 139

The solution

We need to reshape the data using:

    library(tidyr)
DeathPenalty2  =     DeathPenalty |> pivot_longer(Yes:No, names_to="DP", values_to="Count") |> glimpse()
Rows: 8
Columns: 4
$ Victim    <fct> White, White, White, White, Black, Black, Black, Black
$ Defendant <fct> White, White, Black, Black, White, White, Black, Black
$ DP        <chr> "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No"
$ Count     <int> 53, 414, 11, 37, 0, 16, 4, 139

before fitting the log-linear model.

DP2.glm <- glm(Count~Victim*Defendant*DP, data=DeathPenalty2, family=poisson())
DP2.glm |> summary()

Call:
glm(formula = Count ~ Victim * Defendant * DP, family = poisson(), 
    data = DeathPenalty2)

Deviance Residuals: 
[1]  0  0  0  0  0  0  0  0

Coefficients:
                                   Estimate Std. Error z value Pr(>|z|)    
(Intercept)                       4.934e+00  8.482e-02  58.177  < 2e-16 ***
VictimWhite                      -1.324e+00  1.850e-01  -7.155 8.38e-13 ***
DefendantWhite                   -2.162e+00  2.640e-01  -8.189 2.63e-16 ***
DPYes                            -3.548e+00  5.071e-01  -6.996 2.63e-12 ***
VictimWhite:DefendantWhite        4.577e+00  3.149e-01  14.536  < 2e-16 ***
VictimWhite:DPYes                 2.335e+00  6.125e-01   3.813 0.000137 ***
DefendantWhite:DPYes             -2.153e+01  4.225e+04  -0.001 0.999593    
VictimWhite:DefendantWhite:DPYes  2.068e+01  4.225e+04   0.000 0.999609    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 1.2251e+03  on 7  degrees of freedom
Residual deviance: 4.1226e-10  on 0  degrees of freedom
AIC: 54.04

Number of Fisher Scoring iterations: 20
DP2.glm|> anova(test="Chisq")
Analysis of Deviance Table

Model: poisson, link: log

Response: Count

Terms added sequentially (first to last)

                    Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
NULL                                    7    1225.08              
Victim               1   197.93         6    1027.15 < 2.2e-16 ***
Defendant            1   130.79         5     896.36 < 2.2e-16 ***
DP                   1   493.52         4     402.84 < 2.2e-16 ***
Victim:Defendant     1   380.57         3      22.27 < 2.2e-16 ***
Victim:DP            1    16.87         2       5.39 3.999e-05 ***
Defendant:DP         1     5.01         1       0.38   0.02514 *  
Victim:Defendant:DP  1     0.38         0       0.00   0.53769    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Use the original data for the logistic regression model.

DP.glm <- glm(cbind(Yes,No)~Victim*Defendant, data=DeathPenalty,  family=binomial())
DP.glm |> summary()

Call:
glm(formula = cbind(Yes, No) ~ Victim * Defendant, family = binomial(), 
    data = DeathPenalty)

Deviance Residuals: 
[1]  0  0  0  0

Coefficients:
                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                   -3.5482     0.5071  -6.996 2.63e-12 ***
VictimWhite                    2.3352     0.6125   3.813 0.000137 ***
DefendantWhite               -21.9957 53403.2302   0.000 0.999671    
VictimWhite:DefendantWhite    21.1531 53403.2302   0.000 0.999684    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 2.2266e+01  on 3  degrees of freedom
Residual deviance: 2.5803e-10  on 0  degrees of freedom
AIC: 20.92

Number of Fisher Scoring iterations: 22
DP.glm|> anova(test="Chisq")
Analysis of Deviance Table

Model: binomial, link: logit

Response: cbind(Yes, No)

Terms added sequentially (first to last)

                 Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
NULL                                 3    22.2659              
Victim            1  16.8719         2     5.3940 3.999e-05 ***
Defendant         1   5.0142         1     0.3798   0.02514 *  
Victim:Defendant  1   0.3798         0     0.0000   0.53769    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1