# Chapter 11 Classification of Binary Outcomes

This section will be expanded to discuss classification for more than two groups.

• Sometimes odds ratios can be difficult to interpret or understand.
• Sometimes you just want to report the probability of the event occurring.
• Or sometimes you want to predict whether or not a new individual is going to have the event.

For all of these, we need to calculate $p_{i} = P(y_{i}=1)$, the probability of the event.

Back-solving the logistic model $\log\left(\frac{p_{i}}{1-p_{i}}\right) = \beta X$ for $p_{i}$ gives $p_{i} = e^{\beta X} / (1+e^{\beta X})$, the probability of an event. Written out in full:

$p_{i} = \frac{e^{\beta_{0} + \beta_{1}x_{1i} + \beta_{2}x_{2i} + \ldots + \beta_{p}x_{pi}}} {1 + e^{\beta_{0} + \beta_{1}x_{1i} + \beta_{2}x_{2i} + \ldots + \beta_{p}x_{pi}}}$
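The transformation from the linear predictor to a probability can be sketched in R as follows. This is a minimal illustration, not code from the original analysis: the helper name `inv_logit` and the example log-odds values are assumptions made here. Base R's `plogis()` computes the same quantity.

```r
# Inverse logit: maps a linear predictor (log-odds) to a probability.
# `inv_logit` is a hypothetical helper name used only for illustration.
inv_logit <- function(xb) exp(xb) / (1 + exp(xb))

xb <- c(-2, 0, 2)    # illustrative log-odds values, not model output
p  <- inv_logit(xb)
p

# plogis() is the logistic distribution function in base R (stats package);
# it should agree with the hand-written formula.
all.equal(p, plogis(xb))
```

A linear predictor of 0 corresponds to a probability of 0.5; negative values map below 0.5 and positive values above it.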

Consider the main effects model of depression on age, income, and gender from Section 10.3.2.

|   | Estimate | Std. Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | -2.313 | 0.3315 | -6.976 | 3.044e-12 |
| sex | 1.039 | 0.3767 | 2.757 | 0.005831 |

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 268.1 on 293 degrees of freedom

Residual deviance: 259.4 on 292 degrees of freedom

The predicted probability of depression is calculated as: $P(depressed) = \frac{e^{-0.676 - 0.02096*age - .03656*income + 0.92945*gender}} {1 + e^{-0.676 - 0.02096*age - .03656*income + 0.92945*gender}}$

Notice you can’t get a “generic” probability for all individuals. It depends entirely on their covariate profile, that is, the values that X takes on for each record.
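To see how the predicted probability changes with the covariate profile, the sketch below applies the coefficients from the equation above to a few profiles. The three profiles are hypothetical values chosen for illustration, not records from the `depress` data, and the object names `b` and `profiles` are assumptions. Gender is coded 1 for female and 0 for male.

```r
# Coefficients copied from the fitted equation in the text
b <- c(intercept = -0.676, age = -0.02096, income = -0.03656, gender = 0.92945)

# Hypothetical covariate profiles (illustrative only, not from the data)
profiles <- data.frame(
  age    = c(25, 45, 65),
  income = c(15, 20, 40),
  gender = c(1, 0, 1)      # 1 = female, 0 = male
)

# Linear predictor for each profile
xb <- b[["intercept"]] +
      b[["age"]]    * profiles$age +
      b[["income"]] * profiles$income +
      b[["gender"]] * profiles$gender

# Convert log-odds to probabilities; each profile gets its own value
profiles$p_depressed <- plogis(xb)
profiles
```

Each row ends up with a different predicted probability, which is why no single number summarizes the model for everyone.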

Let’s compare the probability of being depressed for males and females separately, while holding age and income constant at their average values.

depress %>% summarize(age=mean(age), income=mean(income))
##        age   income
## 1 44.41497 20.57483

Plug the coefficient estimates and the values of the variables into the equation and calculate. $P(depressed|Female) = \frac{e^{-0.676 - 0.02096(44.4) - .03656(20.6) + 0.92945(1)}} {1 + e^{-0.676 - 0.02096(44.4) - .03656(20.6) + 0.92945(1)}}$

XB.f <- -0.676 - 0.02096*(44.4) - .03656*(20.6) + 0.92945
exp(XB.f) / (1+exp(XB.f))
## [1] 0.1930504

$P(depressed|Male) = \frac{e^{-0.676 - 0.02096(44.4) - .03656(20.6) + 0.92945(0)}} {1 + e^{-0.676 - 0.02096(44.4) - .03656(20.6) + 0.92945(0)}}$

XB.m <- -0.676 - 0.02096*(44.4) - .03656*(20.6)
exp(XB.m) / (1+exp(XB.m))
## [1] 0.08629312

A 44.4-year-old female with an annual income of \$20.6k has a 0.19 probability of being depressed. The probability of depression for a male of equal age and income is 0.086.
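As a consistency check, these two probabilities can be tied back to the odds ratio for gender. The sketch below converts each probability to odds and compares their ratio with the exponentiated gender coefficient. The object names are assumptions made for illustration, and the probabilities are the values computed above.

```r
# Predicted probabilities computed above
p.f <- 0.1930504    # female, average age and income
p.m <- 0.08629312   # male, same age and income

# Convert each probability to odds, then take the ratio
odds.f <- p.f / (1 - p.f)
odds.m <- p.m / (1 - p.m)
odds.f / odds.m

# Odds ratio implied by the gender coefficient in the model
exp(0.92945)
```

Both quantities come out at about 2.53. The probabilities differ by roughly 0.11, while the odds of depression are about 2.5 times higher for females, so the two summaries describe the same model on different scales.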