10.2 Wald test
The Wald test is used for simultaneous tests of \(Q\) variables in a model. This is used primarily in two situations:
- Testing if a categorical variable (with more than 2 levels) as a whole improves model fit.
- Testing whether or not the regression model is helpful in predicting values of Y at all.
Consider a model with \(P\) variables, and suppose you want to test whether \(Q\) additional variables are useful.
- \(H_{0}: Q\) additional variables are useless, i.e., their \(\beta\)’s all = 0
- \(H_{A}: Q\) additional variables are useful to explain/predict \(Y\)
The traditional test statistic that we’ve seen since Intro stats is \(\frac{\hat{\theta}-\theta}{\sqrt{Var(\hat{\theta})}}\), which generally has a \(Z\) or \(T\) distribution.
The Wald test generalizes this to a simultaneous test of any set of linear combinations of the coefficients.
\[ \frac{1}{Q}(R\hat{\theta}_{n}-r)^{'}\left[R({\hat{V}}_{n}/n)R^{'}\right]^{-1} (R\hat{\theta}_{n}-r) \quad \xrightarrow{\mathcal{D}} \quad F(Q,n-P) \]
Where \(\mathbf{R}\) is the restriction matrix that selects which linear combinations of the \(\beta\)’s are being tested, \(r\) is their hypothesized value (typically \(\mathbf{0}\)), and \(\hat{V}_{n}\) is a consistent estimator of the covariance matrix. Instead of a normal distribution, this test statistic has an \(F\) distribution with \(Q\) and \(n-P\) degrees of freedom.
Example: Consider the following model, where \(X_{1}\) and \(X_{2}\) are continuous predictors and \(X_{3}, X_{4}, X_{5}\) are binary indicators from a 4-level categorical variable.
\[ Y = \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2} + \beta_{3}X_{3} + \beta_{4}X_{4} + \beta_{5}X_{5} + \epsilon \]
If you wanted to test (1) whether or not the categorical variable as a whole improves model fit, you would test \(\beta_{3}=\beta_{4}=\beta_{5}=0\), so \(\mathbf{R} = \begin{bmatrix} 0 , 0 ,1,1,1 \end{bmatrix}\), where the entries correspond to \(\beta_{1}, \ldots, \beta_{5}\) and a 1 marks a coefficient under test (formally, each tested coefficient gets its own row of \(\mathbf{R}\)).
If we want to test (2) that the regression plane is useful to predict \(Y\), then we are testing \(\beta_{1}=\beta_{2}=\cdots=\beta_{5}=0\), so \(\mathbf{R} = \begin{bmatrix} 1 , 1 ,1,1,1 \end{bmatrix}\). A sketch of how to compute such a test by hand in R is shown below.
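As a minimal sketch of the mechanics, here is hypothesis (1) computed by hand, assuming a hypothetical fitted model object fit for the model above (coefficients ordered as Intercept, \(X_{1}, \ldots, X_{5}\)). Note that vcov() returns the estimated covariance matrix of the coefficients, i.e., \(\hat{V}_{n}/n\) in the formula above.
# Wald test of H0: beta3 = beta4 = beta5 = 0, for a hypothetical model `fit`
R <- rbind(c(0, 0, 0, 1, 0, 0),   # row selecting beta_3
           c(0, 0, 0, 0, 1, 0),   # row selecting beta_4
           c(0, 0, 0, 0, 0, 1))   # row selecting beta_5
theta.hat <- coef(fit)            # estimated coefficients
V.hat     <- vcov(fit)            # estimated covariance matrix of theta.hat
W <- t(R %*% theta.hat) %*% solve(R %*% V.hat %*% t(R)) %*% (R %*% theta.hat)
Q <- nrow(R)                      # number of restrictions being tested
F.stat <- as.numeric(W) / Q       # F statistic on Q and n - P degrees of freedom
pf(F.stat, df1 = Q, df2 = df.residual(fit), lower.tail = FALSE)   # p-value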
10.2.1 Example: Modeling depression score
Consider a model to predict depression using age, employment status and whether or not the person was chronically ill in the past year as covariates. This example uses the cleaned depression data set.
employ.depression.model <- lm(cesd ~ age + chronill + employ, data=depress)
summary(employ.depression.model)
##
## Call:
## lm(formula = cesd ~ age + chronill + employ, data = depress)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.665 -5.843 -1.705 3.155 33.576
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.48255 1.50174 7.646 3.19e-13 ***
## age -0.13300 0.03514 -3.785 0.000187 ***
## chronill 2.68759 1.02368 2.625 0.009121 **
## employHouseperson 6.75036 1.79661 3.757 0.000208 ***
## employIn School 1.96663 5.99549 0.328 0.743138
## employOther 4.89731 4.27848 1.145 0.253320
## employPT 3.25937 1.47240 2.214 0.027645 *
## employRetired 3.23316 1.88602 1.714 0.087565 .
## employUnemp 7.63163 2.33915 3.263 0.001238 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.385 on 285 degrees of freedom
## Multiple R-squared: 0.1217, Adjusted R-squared: 0.09704
## F-statistic: 4.936 on 8 and 285 DF, p-value: 9.86e-06
The results of this model show that age and chronic illness are statistically associated with CESD (each p < .01). However, employment status shows mixed results: some employment statuses are significantly different from the reference group, and some are not. So overall, is employment status associated with depression?
(1) Testing if a categorical variable as a whole improves model fit
Since employment is a categorical variable, each coefficient estimate shown is the effect that being in that employment category has on depression, compared to being employed full time (the reference group). For example, the coefficient for PT employment is greater than zero, so part-time employees have a higher average CESD score than someone who is employed full time.
To test that employment status affects CESD we need to do a global test that all \(\beta\)’s related to employment status are 0.
\(H_{0}: \beta_{3} = \beta_{4} = \beta_{5} = \beta_{6} = \beta_{7} = \beta_{8} = 0\)
\(H_{A}\): At least one \(\beta_{j}\) is not 0.
ANOVA to the rescue! Since ANOVA partitions the variance in our outcome \(Y\) into amounts due to each variable, we get an ANOVA table that has one row per term:
aov(employ.depression.model) %>% summary()
## Df Sum Sq Mean Sq F value Pr(>F)
## age 1 615 614.6 8.743 0.003368 **
## chronill 1 409 409.2 5.821 0.016469 *
## employ 6 1752 292.0 4.154 0.000509 ***
## Residuals 285 20036 70.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
- The last row, for employ, is what we are interested in here.
- First confirm that the degrees of freedom are correct. They should equal the number of categories in the variable you are testing, minus 1.
  - Employment has 7 levels, so \(df=6\).
  - Or equivalently, the degrees of freedom are the number of \(\beta\)’s you are testing to be 0.
The p-value of this Wald test is significant, thus not all \(\beta\)’s are equal to zero, which implies that employment status significantly predicts CESD score.
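The same Wald test can also be requested directly, as a sketch assuming the car package is installed; here matchCoefs() picks out every coefficient whose name contains "employ":
library(car)
# Wald test that all employment coefficients are simultaneously zero
linearHypothesis(employ.depression.model, matchCoefs(employ.depression.model, "employ"))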
Note that the p-values for age and chronill are not the same as in the regression model. ANOVA models are order dependent, as they describe a sequential “reduction” in variance in the outcome due to each variable. A deeper explanation of this is not included in these notes at this time.
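Because of this order dependence, a useful cross-check (assuming the depress data frame from above is loaded) is to compare the models with and without employ directly; anova() on two nested models gives the partial F test for the added terms regardless of where they appear in the formula:
reduced.model <- lm(cesd ~ age + chronill, data = depress)   # model without employment
anova(reduced.model, employ.depression.model)                # partial F test for the 6 employ terms
Since employ is the last term in employ.depression.model, this F statistic matches the employ row of the ANOVA table above.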
(2) Testing that the regression plane is useful to predict \(Y\)
Here we are testing \(H_{0}: \beta_{1} = \beta_{2} = \cdots = \beta_{8} = 0\), i.e., that all slope coefficients are simultaneously zero. This information is provided to us directly in the last line of the summary output from a linear model.
summary(employ.depression.model)
##
## Call:
## lm(formula = cesd ~ age + chronill + employ, data = depress)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.665 -5.843 -1.705 3.155 33.576
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.48255 1.50174 7.646 3.19e-13 ***
## age -0.13300 0.03514 -3.785 0.000187 ***
## chronill 2.68759 1.02368 2.625 0.009121 **
## employHouseperson 6.75036 1.79661 3.757 0.000208 ***
## employIn School 1.96663 5.99549 0.328 0.743138
## employOther 4.89731 4.27848 1.145 0.253320
## employPT 3.25937 1.47240 2.214 0.027645 *
## employRetired 3.23316 1.88602 1.714 0.087565 .
## employUnemp 7.63163 2.33915 3.263 0.001238 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.385 on 285 degrees of freedom
## Multiple R-squared: 0.1217, Adjusted R-squared: 0.09704
## F-statistic: 4.936 on 8 and 285 DF, p-value: 9.86e-06
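The F-statistic of 4.936 on 8 and 285 degrees of freedom, with p-value \(9.86 \times 10^{-6}\), indicates that the regression model as a whole is useful for predicting CESD score. As a sketch (again assuming the depress data frame is loaded), the same overall test can be reproduced by comparing against an intercept-only model:
null.model <- lm(cesd ~ 1, data = depress)   # intercept-only model
anova(null.model, employ.depression.model)   # overall F test: 4.936 on 8 and 285 df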
10.2.2 Testing for a moderation effect in a multiple regression model
Moderation is introduced in Chapter 8, and helps to set the motivation for stratified models. In Section 10.1.2, we show that an interaction term in a regression model is equivalent to stratification.
But what if you have other predictors in the model, not just the ones involved in the interaction? We can use the Wald test to assess whether a measure is a significant moderator without stratifying.
Continuing with the depression example, we saw that employment affects CESD depression score. What if we think that the effect (slope) of age on CESD may differ depending on employment status? That is, is the effect of age on depression different for those who are employed versus retired?
emp.dep.intx <- lm(cesd ~ age + chronill + employ + age*employ, data=depress)
summary(emp.dep.intx)
##
## Call:
## lm(formula = cesd ~ age + chronill + employ + age * employ, data = depress)
##
## Residuals:
## Min 1Q Median 3Q Max
## -18.728 -5.499 -1.900 3.251 33.973
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.514408 2.027479 5.186 4.14e-07 ***
## age -0.108148 0.050325 -2.149 0.03250 *
## chronill 2.716041 1.035393 2.623 0.00919 **
## employHouseperson 13.136141 5.279521 2.488 0.01343 *
## employIn School -39.366285 42.218614 -0.932 0.35192
## employOther 5.099073 18.815341 0.271 0.78659
## employPT 2.742451 3.850676 0.712 0.47694
## employRetired -2.851372 12.739350 -0.224 0.82306
## employUnemp 13.820668 5.677595 2.434 0.01555 *
## age:employHouseperson -0.130545 0.102421 -1.275 0.20351
## age:employIn School 1.988808 1.989220 1.000 0.31828
## age:employOther -0.009316 0.386657 -0.024 0.98080
## age:employPT 0.008960 0.085254 0.105 0.91638
## age:employRetired 0.073942 0.182591 0.405 0.68582
## age:employUnemp -0.174615 0.146179 -1.195 0.23328
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.408 on 279 degrees of freedom
## Multiple R-squared: 0.1354, Adjusted R-squared: 0.09201
## F-statistic: 3.121 on 14 and 279 DF, p-value: 0.0001511
Let’s revisit our list of beta coefficients:
- \(\beta_{1}\): Age
- \(\beta_{2}\): Chronic illness
- \(\beta_{3} \ldots \beta_{8}\): Effects of the different levels of employment (Houseperson through Unemployed)
- \(\beta_{9} \ldots \beta_{14}\): Multiplicative effects that the levels of employment have on the slope of age.
To see if the interaction term age*employ is significant, we run an F test via aov() and interpret the p-value for the interaction term age:employ. Here the p-value is very large (0.62), so there is no reason to believe that employment moderates the relationship between age and CESD score. This is a two-way relationship: there is also no reason to believe that age moderates the relationship between employment and CESD score.
aov(emp.dep.intx) |> summary()
## Df Sum Sq Mean Sq F value Pr(>F)
## age 1 615 614.6 8.694 0.003462 **
## chronill 1 409 409.2 5.789 0.016781 *
## employ 6 1752 292.0 4.131 0.000541 ***
## age:employ 6 313 52.1 0.737 0.620089
## Residuals 279 19723 70.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
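Equivalently, since both models are already fit, the interaction can be tested by comparing the nested models directly; because age:employ is the last term added, this partial F test matches the age:employ row of the aov() table above:
anova(employ.depression.model, emp.dep.intx)   # partial F test for the 6 interaction terms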