Download the R markdown file for this lecture.
In this lecture…
We will consider hypothesis tests for one-way models.
We will also look at model diagnostics for these models.
Factorial models (using the treatment constraint) can be written as multiple linear regression models, using dummy variables:
\[Y_i = \mu + \alpha_2 z_{i2} + \ldots + \alpha_K z_{iK} + \varepsilon_i \quad(i=1,2,\ldots,n)\]
Why is there no \(\alpha_1 z_{i1}\) term? Under the treatment constraint \(\alpha_1 = 0\), so the first level of the factor is absorbed into the intercept \(\mu\) and needs no dummy variable.
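To see the coding concretely, you can inspect the model matrix R builds for a factor under its default treatment contrasts; here is a minimal sketch using a hypothetical three-level factor `grp`:

```r
## R's default "treatment" contrasts drop the first level, so the model
## matrix contains an intercept column plus K - 1 dummy columns; there is
## no column corresponding to alpha_1
grp <- factor(c("A", "A", "B", "B", "C", "C"))
model.matrix(~ grp)
```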
Although the two types of model are expressed in a similar way, there are important differences in how they should be analysed.
**Importantly:**
Typically the primary question of interest for a one-way model is this: is the mean response dependent on the factor?
In the factorial model (with treatment constraint) this question is addressed by testing H0: \[\alpha_2 = \alpha_3 = \cdots = \alpha_K = 0~~~~\mbox{(choose M0)}\] versus H1: \[\alpha_2, \alpha_3, \ldots, \alpha_K~\mbox{not all zero}~~~~\mbox{(choose M1)}\]
Because this is a test involving multiple parameters we will usually need to perform an F test.
Testing for the significance of the factor in a one-way model is the same as comparing the (nested) models: \[M0: Y_i = \mu + \varepsilon_i \quad (i=1,2,\ldots,n)\] and \[M1: Y_i = \mu + \alpha_2 z_{i2} + \ldots + \alpha_K z_{iK} + \varepsilon_i \quad (i=1,2,\ldots,n)\]
Retaining H0 corresponds to M0 being adequate.
Accepting H1 corresponds to M1 being better.
We can use the F tests for nested models developed earlier (since factorial models can be regarded as regression models).
The F test statistic is \[F = \frac{[RSS_{M0} - RSS_{M1}]/(K-1)}{RSS_{M1}/(n-K)}\]
Note that the model M1 contains K ‘regression’ parameters, namely \(\mu\) and \(\alpha_2, \alpha_3, \ldots, \alpha_K\) (assuming the treatment constraint).
If \(f_{\mbox{obs}}\) is the observed value of the test statistic, and \(X\) is a random variable with an \(F_{K-1,\,n-K}\) distribution, then the P-value is given by \(P = P(X \ge f_{\mbox{obs}})\).
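As a concrete illustration, the following sketch carries out this nested-model comparison in R, computing the F statistic and P-value both directly from the two residual sums of squares and in one step with `anova()`. The data frame `dat`, response `y`, and factor `grp` are hypothetical placeholder names.

```r
## Fit the null model M0 (common mean) and the one-way model M1
m0 <- lm(y ~ 1, data = dat)
m1 <- lm(y ~ grp, data = dat)

n <- nrow(dat)               # total number of observations
K <- nlevels(dat$grp)        # number of factor levels (grp must be a factor)
rss0 <- sum(resid(m0)^2)     # RSS under M0
rss1 <- sum(resid(m1)^2)     # RSS under M1

## F statistic and P-value from the F(K-1, n-K) distribution
f <- ((rss0 - rss1) / (K - 1)) / (rss1 / (n - K))
pf(f, df1 = K - 1, df2 = n - K, lower.tail = FALSE)

## The same comparison in a single call
anova(m0, m1)
```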
You can download `caffeine.csv` to replicate the following example.
```r
Caffeine <- read.csv(file = "caffeine.csv", header = TRUE)
Caffeine.lm <- lm(Taps ~ Dose, data = Caffeine)
summary(Caffeine.lm)
```
```
Call:
lm(formula = Taps ~ Dose, data = Caffeine)

Residuals:
   Min     1Q Median     3Q    Max 
-3.400 -2.075 -0.300  1.675  3.700 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 248.3000     0.7047 352.326  < 2e-16 ***
DoseLow      -1.9000     0.9967  -1.906  0.06730 .  
DoseZero     -3.5000     0.9967  -3.512  0.00158 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.229 on 27 degrees of freedom
Multiple R-squared:  0.3141,    Adjusted R-squared:  0.2633
F-statistic: 6.181 on 2 and 27 DF,  p-value: 0.006163
```
In general, you conduct a test for factor effects using the `anova()` command.
For a one-way model, R returns the omnibus F test statistic and related quantities.
```r
anova(Caffeine.lm)
```
```
Analysis of Variance Table

Response: Taps
          Df Sum Sq Mean Sq F value   Pr(>F)   
Dose       2   61.4 30.7000  6.1812 0.006163 **
Residuals 27  134.1  4.9667                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
The observed F-statistic for testing the significance of caffeine dose is \(f_{\mbox{obs}} = 6.181\) on \(K-1 = 2\) and \(n-K = 27\) degrees of freedom.
The corresponding P-value is \(P = 0.0062\), so we have very strong evidence against H0; that is, finger tapping speed does appear to be related to caffeine dose.
The table of coefficients in the R output shows that the mean response difference between the high dose and control groups is a little more than twice as big as the difference between low dose and control group.
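To relate the coefficients to the group means directly, here is a quick check using the `Caffeine` data frame loaded above:

```r
## Under the treatment constraint the intercept is the mean of the
## baseline (High) group, and each dose coefficient is that group's
## difference from the baseline:
## High = 248.3, Low = 248.3 - 1.9 = 246.4, Zero = 248.3 - 3.5 = 244.8
tapply(Caffeine$Taps, Caffeine$Dose, mean)
```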
Use the ANOVA table produced by R to answer the following questions on the caffeine data.
What is the residual sum of squares for the full model M1?
What is the mean sum of squares for the model M1?
What is the residual sum of squares for the null model M0, given below? \[M0: Y_i = \mu + \varepsilon_i~~~~~~~(i=1,2,\ldots,30)\]
What is the mean sum of squares for the model M0?
We have tested whether the speed of students’ finger tapping is related to the caffeine dose received. As usual, the validity of our conclusions is contingent on the adequacy of the fitted model. We should therefore inspect standard model diagnostics.
```r
par(mfrow = c(2, 2))
plot(Caffeine.lm)
```
```
hat values (leverages) are all = 0.1
 and there are no factor predictors; no plot no. 5
```

[The diagnostic plots for `Caffeine.lm` appear here.]
For factorial models it is particularly important to scrutinize the plots for heteroscedasticity (non-constant variance), i.e. to check whether the error variance changes markedly between the factor levels.
When heteroscedasticity is present, assumption A3 fails and conclusions based on F tests become unreliable.
The diagnostic plots do not provide any reason to doubt the appropriateness of the fitted model.
The constant variance assumption can also be checked formally with Bartlett's test of homogeneity of variances. The hypotheses for this test are:
*H~0~*: error variance constant across groups versus
*H~1~*: error variance not constant across groups
Rejecting the null means that the equal variance assumption is violated.
```r
Caffeine.bt <- bartlett.test(Taps ~ Dose, data = Caffeine)
Caffeine.bt
```
```
        Bartlett test of homogeneity of variances

data:  Taps by Dose
Bartlett's K-squared = 0.18767, df = 2, p-value = 0.9104
```
The P-value for this test is \(P = 0.9104\), so for the caffeine data we have no reason to doubt the constant variance assumption.
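If you need the P-value programmatically (for example inside a script), it can be extracted from the `htest` object returned by `bartlett.test()`:

```r
## The test result is a list of class "htest"; its p.value
## component holds the P-value shown in the printed output
Caffeine.bt$p.value
```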
Comments on the Diagnostic Plots: Technical
For a factorial model, the fitted values are constant within each group. This explains the vertical lines in the left-hand plots.
A factorial design is balanced when the same number of observations is taken at each factor level.
When the data are balanced the leverage is constant, so a plot of the leverage would be pointless.
As a result, R produces a plot of standardized residuals against factor levels for balanced factorial models in place of the usual residuals versus leverage plot.
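You can check the constant leverage directly from the fitted model. For the caffeine data each dose group contains 10 observations, so every hat value equals \(1/10 = 0.1\), matching the message R printed alongside the diagnostic plots above:

```r
## In a balanced one-way design each observation in a group of size n_j
## has leverage 1/n_j; here every dose group has 10 observations
range(hatvalues(Caffeine.lm))   # both endpoints are 0.1
```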