Download the R markdown file for this lecture.

In a multiple regression model there are a variety of hypotheses that we might wish to test. For example:

  1. Is the mean response related to any of the predictors?
  2. Does a given predictor provide additional information about the response over and above that provided by the other predictors?

In this lecture we shall look at the methodology for testing question 1 above.

The F Test for Overall Fit of a Multiple Regression Model

For any given model: \(Y_i = \beta_0 + \beta_1 x_{i1} + \ldots + \beta_p x_{ip} + \varepsilon_i\) where i=1,2,…,n.

We ask: Is the mean response (linearly) related to any of the predictors?

We test:

H0: \(\beta_1 = \beta_2 = \ldots = \beta_p = 0\) i.e. mean response is not linearly related to any of the predictors

against

H1: \(\beta_1, \beta_2, \ldots, \beta_p~\mbox{not all zero}\) i.e. mean response is linearly related to at least one of the predictors.

Testing H0 versus H1 is equivalent to comparing two different models:

M0: \(Y_i = \beta_0 + \varepsilon_i\) and

M1: \(Y_i = \beta_0 + \beta_1 x_{i1} + \ldots + \beta_p x_{ip} + \varepsilon_i\) where i=1,2,…,n.

Model M0 corresponds to H0.

Model M1 corresponds to H1.

General Thoughts on Choosing Between Models

In choosing between models, statisticians have two competing aims: to keep the model simple, and to make it fit the data well.

We can measure the complexity of a linear regression model by the number of regression parameters, p+1: the greater this value, the more complex the model.

We can measure the closeness of fit of the model to the data using the residual sum of squares: the smaller this value, the closer the fit.

Think of model comparison like clothes shopping — is it worth spending more (parameters) in order to get a better (model) fit?

F-Tests for Overall Fit (again)

We want to compare:

Model M0 is cheap (just 1 regression parameter) but may fit badly: its residual sum of squares, \(RSS_{M0}\), may be large.

Model M1 is more expensive (p+1 regression parameters) but will fit better: its residual sum of squares, \(RSS_{M1}\), can be no larger than \(RSS_{M0}\).

So we calculate residual sum of squares (RSS) for each model and compute the following F test statistic:

\[F = \frac{[RSS_{M0} - RSS_{M1}]/p}{RSS_{M1}/(n-p-1)}\]

Large values of F suggest that we should prefer M1 to M0: intuitively, the improvement in fit is worth the extra cost in parameters.
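In R, this comparison of nested models can be carried out with anova(). A minimal sketch on simulated data (not the paramo data discussed later):

```r
# Sketch on simulated data: F test comparing a null model M0 against a
# two-predictor model M1 using anova().
set.seed(1)
n  <- 30
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 + 1.5 * x1 + rnorm(n)   # the response truly depends on x1 only

M0 <- lm(y ~ 1)                 # intercept-only model (corresponds to H0)
M1 <- lm(y ~ x1 + x2)           # full model (corresponds to H1)

anova(M0, M1)                   # reports the F statistic and its P-value
```

anova() computes exactly the F statistic defined above, using the residual sums of squares of the two fitted models.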

How large is “large”?

If model M0 is correct (i.e. H0 is correct) then the F test statistic has an F distribution with \(p\) and \(n-p-1\) degrees of freedom, often denoted \(F_{p,\,n-p-1}\).

We use this fact to calculate the P-value for the F statistic, and hence test H0 versus H1.

Aside: The F Distribution

An F distribution is defined by two degrees of freedom.

Random variables from the F distribution take only non-negative values.

Some examples of the density of various F distributions are displayed. The shape depends on the numerator and denominator degrees of freedom…

[Figure: probability density functions for F distributions with numerator df of 3 (left) and 10 (right), and denominator df of 10 (upper) and 30 (lower).]

… but large values of x are always unlikely to be observed by chance alone.
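In R, tail probabilities and critical values for F distributions are available through pf() and qf(). For example, with 3 and 10 degrees of freedom:

```r
# Upper-tail probability P(X >= 3) for X ~ F(3, 10); lower.tail = FALSE
# gives the right-hand tail directly
pf(3, df1 = 3, df2 = 10, lower.tail = FALSE)

# The value exceeded by chance only 5% of the time (the 95th percentile)
qf(0.95, df1 = 3, df2 = 10)
```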

Analysis of Paramo Regression by Hand

For the paramo biodiversity data, consider the models M0:

\[E[\mbox{N}] = \beta_0\] and M1: \[E[\mbox{N}] = \beta_0 + \beta_1 \mbox{AR} + \beta_2 \mbox{EL} + \beta_3 \mbox{DEc} + \beta_4 \mbox{DNI}\]

We want to test H0: \(\beta_1 = \beta_2 = \beta_3 = \beta_4 = 0\) that is, M0 correct; against H1: \(\beta_1, \beta_2, \beta_3, \beta_4\) not all zero — i.e. M1 is better.

Calculations give RSSM0 = 1498.9 and RSSM1 = 404.6.

We also need to know that p=4 and n=14 in our context.

The F test statistic is

\[F = \frac{[RSS_{M0} - RSS_{M1}]/p}{RSS_{M1}/(n-p-1)} = \frac{[1498.9 - 404.6]/4}{404.6/9} = 6.09\]

The corresponding P-value is the right-hand tail probability:

\(P(X \ge 6.09) = 0.012\) where \(X \sim F_{4,9}\)

Our Conclusion: the data provide evidence that the number of bird species depends on at least one of the explanatory variables AR, EL, DEc, or DNI.
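The by-hand calculation above is easily reproduced in R, using the RSS values quoted for the paramo data:

```r
# F test for overall fit, computed from the quoted RSS values
RSS.M0 <- 1498.9   # residual sum of squares for the null model M0
RSS.M1 <- 404.6    # residual sum of squares for the full model M1
p <- 4             # number of predictors
n <- 14            # number of observations

Fstat <- ((RSS.M0 - RSS.M1) / p) / (RSS.M1 / (n - p - 1))
Fstat                                                     # approximately 6.09
pf(Fstat, df1 = p, df2 = n - p - 1, lower.tail = FALSE)   # approximately 0.012
```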

Omnibus F Test Statistics in R

The test of whether the response is related to any of the explanatory variables is sometimes called an omnibus F test.

We do not have to do the test by hand — R provides the F statistic and corresponding P-value for this test as a standard part of the summary() output for a linear model.

Back to Paramo Example

Download paramo.csv

Paramo <- read.csv(file = "https://r-resources.massey.ac.nz/161221/data/paramo.csv", 
    header = TRUE, row.names = 1)
Paramo.lm <- lm(N ~ ., data = Paramo)
Paramo.lm.sum <- summary(Paramo.lm)
Paramo.lm.sum

Call:
lm(formula = N ~ ., data = Paramo)

Residuals:
     Min       1Q   Median       3Q      Max 
-10.6660  -3.4090   0.0834   3.5592   8.2357 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept) 27.889386   6.181843   4.511  0.00146 **
AR           5.153864   3.098074   1.664  0.13056   
EL           3.075136   4.000326   0.769  0.46175   
DEc         -0.017216   0.005243  -3.284  0.00947 **
DNI          0.016591   0.077573   0.214  0.83541   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.705 on 9 degrees of freedom
Multiple R-squared:  0.7301,    Adjusted R-squared:  0.6101 
F-statistic: 6.085 on 4 and 9 DF,  p-value: 0.01182
names(Paramo.lm.sum)
 [1] "call"          "terms"         "residuals"     "coefficients" 
 [5] "aliased"       "sigma"         "df"            "r.squared"    
 [9] "adj.r.squared" "fstatistic"    "cov.unscaled" 
Paramo.lm.sum$r.squared
[1] 0.730068
Paramo.lm.sum$fstatistic
   value    numdf    dendf 
6.085434 4.000000 9.000000 

The F statistic is 6.085 and the corresponding P-value is 0.012, agreeing with our by-hand calculations.
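Note that summary() stores the F statistic and its degrees of freedom in the fstatistic component, but not the P-value itself; the P-value can be recomputed with pf(). A sketch, assuming Paramo.lm.sum has been created as above:

```r
# Recover the omnibus P-value from the components stored by summary();
# assumes Paramo.lm.sum was created by the code above.
fs <- Paramo.lm.sum$fstatistic
pf(fs[["value"]], df1 = fs[["numdf"]], df2 = fs[["dendf"]],
   lower.tail = FALSE)   # approximately 0.012
```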