## 10.2 Simultaneous test of multiple variables (PMA6 9.5)

The General-F test is used for simultaneous tests of $$Q$$ variables in a model. This is used primarily in two situations:

1. Testing if a categorical variable (with more than 2 levels) as a whole improves model fit.
2. Testing whether or not the regression model is helpful in predicting values of Y at all.

Consider a model with $$P$$ variables and you want to test if $$Q$$ additional variables are useful.

• $$H_{0}: Q$$ additional variables are useless, i.e., their $$\beta$$’s all = 0
• $$H_{A}: Q$$ additional variables are useful to explain/predict $$Y$$

We can leverage the ANOVA framework to compare the residual sum of squares between the model including the $$Q$$ variables, and the one without.

$F = \frac{({\mbox{RSS}}_P-{\mbox{RSS}_{P+Q})}/Q}{{\mbox {RSS}}_{P+Q}{/}(N-P-Q-1)}$

The numerator quantifies improvement in the model from adding the additional $$Q$$ variables. This ratio has a $$F$$ distribution with $$Q$$ and $$N-P-Q-1$$ degrees of freedom.

Example Consider the following model, where $$X_{1}$$ and $$X_{2}$$ are continuous predictors and $$X_{3}, X_{4}, X_{5}$$ are binary indicators from a 4 level categorical variable.

$Y = \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2} + \beta_{3}X_{3} + \beta_{4}x_{4} + \beta_{5}X_{5}+\epsilon_{i}$

If you wanted to test (1) whether or not the categorical variable as a whole improves model fit, then $$\mathbf{R} = \begin{bmatrix} 0 , 0 ,1,1,1 \end{bmatrix}$$

If we want to test (2) that the regression plane is useful to predict $$Y$$, then we are testing $$\beta_{1}=\beta_{2}=...\beta_{5}=0$$, then $$\mathbf{R} = \begin{bmatrix} 1 , 1 ,1,1,1 \end{bmatrix}$$.

### 10.2.1 Example: Modeling depression score

Consider a model to predict depression using age, employment status and whether or not the person was chronically ill in the past year as covariates. This example uses the cleaned depression data set.

employ.depression.model <- lm(cesd ~ age + chronill + employ, data=depress)
summary(employ.depression.model)
##
## Call:
## lm(formula = cesd ~ age + chronill + employ, data = depress)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -16.665  -5.843  -1.705   3.155  33.576
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)       11.48255    1.50174   7.646 3.19e-13 ***
## age               -0.13300    0.03514  -3.785 0.000187 ***
## chronill           2.68759    1.02368   2.625 0.009121 **
## employHouseperson  6.75036    1.79661   3.757 0.000208 ***
## employIn School    1.96663    5.99549   0.328 0.743138
## employOther        4.89731    4.27848   1.145 0.253320
## employPT           3.25937    1.47240   2.214 0.027645 *
## employRetired      3.23316    1.88602   1.714 0.087565 .
## employUnemp        7.63163    2.33915   3.263 0.001238 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.385 on 285 degrees of freedom
## Multiple R-squared:  0.1217, Adjusted R-squared:  0.09704
## F-statistic: 4.936 on 8 and 285 DF,  p-value: 9.86e-06

The results of this model show that age and chronic illness are statistically associated with CESD (each p<.006). However employment status shows mixed results. Some employment statuses are significantly different from the reference group, some are not. So overall, is employment status associated with depression?

(1) Testing if a categorical variable as a whole improves model fit

Since employment is a categorical variable, all the coefficient estimates shown are the effect of being in that income category has on depression compared to being employed full time. For example, the coefficient for PT employment is greater than zero, so they have a higher CESD score compared to someone who is fully employed.

To test that employment status affects CESD we need to do a global test that all $$\beta$$’s related to employment status are 0.

$$H_{0}: \beta_{3} = \beta_{4} = \beta_{5} = \beta_{6} = \beta_{7} = \beta_{8} = 0$$
$$H_{A}$$: At least one $$\beta_{j}$$ is not 0.

ANOVA to the rescue! Since ANOVA partitions the variance in our outcome $$Y$$ into amounts due to each variable, we get an ANOVA table that has one row per term:

aov(employ.depression.model) %>% summary()
##              Df Sum Sq Mean Sq F value   Pr(>F)
## age           1    615   614.6   8.743 0.003368 **
## chronill      1    409   409.2   5.821 0.016469 *
## employ        6   1752   292.0   4.154 0.000509 ***
## Residuals   285  20036    70.3
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
• The last row for employ is what we are interested in here.
• First confirm that the degrees of freedom are correct. It should equal the # of categories in the variable you are testing, minus 1.
• Employment has 7 levels, so $$df=6$$.
• Or equivalently, the degrees of freedom are the number of $$beta$$’s you are testing to be 0.

The p-value of this Wald test is significant, thus not $$beta$$’s are equal to zero, which implies employment status significantly predicts CESD score.

Note the p-values for the individual coefficients age and chronill are not the same as in the regression model. ANOVA models are order dependent as they describe a “reduction” in variance in the outcome due to that variable. A deeper explanation of this is not included in these notes at this time.

(2) Testing that the regression plane is useful to predict $$Y$$

This information is provided to us directly in the last line of the summary output from a linear model.

summary(employ.depression.model)
##
## Call:
## lm(formula = cesd ~ age + chronill + employ, data = depress)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -16.665  -5.843  -1.705   3.155  33.576
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)       11.48255    1.50174   7.646 3.19e-13 ***
## age               -0.13300    0.03514  -3.785 0.000187 ***
## chronill           2.68759    1.02368   2.625 0.009121 **
## employHouseperson  6.75036    1.79661   3.757 0.000208 ***
## employIn School    1.96663    5.99549   0.328 0.743138
## employOther        4.89731    4.27848   1.145 0.253320
## employPT           3.25937    1.47240   2.214 0.027645 *
## employRetired      3.23316    1.88602   1.714 0.087565 .
## employUnemp        7.63163    2.33915   3.263 0.001238 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.385 on 285 degrees of freedom
## Multiple R-squared:  0.1217, Adjusted R-squared:  0.09704
## F-statistic: 4.936 on 8 and 285 DF,  p-value: 9.86e-06

### 10.2.2 Testing for a moderation effect in a multiple regression model.

Moderation is introduced in Chapter 8, and helps to set the motivation for stratified models. Later, in Chapter 10.1.2, we show that an interaction term in a regression model is equivelant to stratification.

Well what if you have other predictors in the model, not just the ones that you have an interaction on? We can use the Wald test to assess if a measure is a significant moderator without stratifying.

Continuing with the depression example we saw that employment affects CESD depression score. What if we think that the effect (slope) of age on CESD may be different depending on their employment? That is, is the effect of age on depression different for those that are employed versus retired?

emp.dep.intx <- lm(cesd ~ age + chronill + employ + age*employ, data=depress)
summary(emp.dep.intx)
##
## Call:
## lm(formula = cesd ~ age + chronill + employ + age * employ, data = depress)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -18.728  -5.499  -1.900   3.251  33.973
##
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)
## (Intercept)            10.514408   2.027479   5.186 4.14e-07 ***
## age                    -0.108148   0.050325  -2.149  0.03250 *
## chronill                2.716041   1.035393   2.623  0.00919 **
## employHouseperson      13.136141   5.279521   2.488  0.01343 *
## employIn School       -39.366285  42.218614  -0.932  0.35192
## employOther             5.099073  18.815341   0.271  0.78659
## employPT                2.742451   3.850676   0.712  0.47694
## employRetired          -2.851372  12.739350  -0.224  0.82306
## employUnemp            13.820668   5.677595   2.434  0.01555 *
## age:employHouseperson  -0.130545   0.102421  -1.275  0.20351
## age:employIn School     1.988808   1.989220   1.000  0.31828
## age:employOther        -0.009316   0.386657  -0.024  0.98080
## age:employPT            0.008960   0.085254   0.105  0.91638
## age:employRetired       0.073942   0.182591   0.405  0.68582
## age:employUnemp        -0.174615   0.146179  -1.195  0.23328
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.408 on 279 degrees of freedom
## Multiple R-squared:  0.1354, Adjusted R-squared:  0.09201
## F-statistic: 3.121 on 14 and 279 DF,  p-value: 0.0001511

• $$\beta_{1}$$: Age
• $$\beta_{2}$$: Chronic illness
• $$\beta_{3} \ldots \beta_{7}$$: Effects of different levels of employment (Houseperson to Unemployed)
• $$\beta_{8} \ldots \beta_{12}$$: Multiplicative effect that levels of employment have on the slope of age.

To see if the interaction term age*employ is significant, we run an F test via aov() and interpret the p-value for the interaction term age:employ. Here the pvalue is very large, so there is no reason to believe that employment moderates the relationship between age and CESD score. This is a two way relationship. There is also no reason to believe that age moderates the relationship between employment and CESD score.

aov(emp.dep.intx) |> summary()
##              Df Sum Sq Mean Sq F value   Pr(>F)
## age           1    615   614.6   8.694 0.003462 **
## chronill      1    409   409.2   5.789 0.016781 *
## employ        6   1752   292.0   4.131 0.000541 ***
## age:employ    6    313    52.1   0.737 0.620089
## Residuals   279  19723    70.7
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This last table is also known as a “Two Factor” or “Two way” ANOVA with an interaction term. This is quite often used in scientific experiments where two treatments (and their combination) is being investigated.