Chapter 11 Classification of Binary Outcomes

This section will be expanded to discuss classification for more than two groups.

  • Sometimes odds ratios can be difficult to interpret or understand.
  • Sometimes you just want to report the probability of the event occurring.
  • Or sometimes you want to predict whether or not a new individual is going to have the event.

For all of these, we need to calculate \(p_{i} = P(y_{i}=1)\), the probability of the event.

Back-solving the logistic model for \(p_{i}\) gives the probability of the event, \(p_{i} = e^{\beta X} / (1+e^{\beta X})\).
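The back-solving step can be written out explicitly. This is a sketch that assumes the model is stated on the log-odds (logit) scale, with \(\beta X\) as shorthand for the linear predictor \(\beta_{0} + \beta_{1}x_{1i} + \ldots + \beta_{p}x_{pi}\):

\[ \begin{aligned} \log\left(\frac{p_{i}}{1-p_{i}}\right) &= \beta X \\ \frac{p_{i}}{1-p_{i}} &= e^{\beta X} \\ p_{i} &= e^{\beta X}\left(1-p_{i}\right) \\ p_{i}\left(1+e^{\beta X}\right) &= e^{\beta X} \\ p_{i} &= \frac{e^{\beta X}}{1+e^{\beta X}} \end{aligned} \]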

\[ p_{i} = \frac{e^{\beta_{0} + \beta_{1}x_{1i} + \beta_{2}x_{2i} + \ldots + \beta_{p}x_{pi}}} {1 + e^{\beta_{0} + \beta_{1}x_{1i} + \beta_{2}x_{2i} + \ldots + \beta_{p}x_{pi}}} \]
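The formula above translates directly into a small function. The following is a minimal Python sketch rather than code from these notes; the function name `predicted_probability` and its argument layout (intercept first, then one coefficient per covariate) are assumptions made for illustration.

```python
import math

def predicted_probability(coefs, x):
    """Probability of the event for one record.

    coefs: [b0, b1, ..., bp], intercept first (hypothetical layout).
    x:     [x1, ..., xp], the covariate values for that record.
    """
    if len(coefs) != len(x) + 1:
        raise ValueError("expected one intercept plus one coefficient per covariate")
    # Linear predictor: b0 + b1*x1 + ... + bp*xp
    eta = coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))
    # Both branches equal exp(eta) / (1 + exp(eta));
    # the first form avoids overflow when eta is large and positive.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    return math.exp(eta) / (1.0 + math.exp(eta))

# A linear predictor of 0 corresponds to a probability of one half.
print(predicted_probability([0.0, 1.0], [0.0]))  # 0.5
```

Because the result depends on `x`, the same fitted coefficients give a different probability for every covariate profile.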

Consider the main effects model of depression on age, income, and gender from Section 10.3.2.

              Estimate  Std. Error  z value   Pr(>|z|)
(Intercept)     -2.313      0.3315   -6.976  3.044e-12
sex              1.039      0.3767    2.757   0.005831

(Dispersion parameter for binomial family taken to be 1)

Null deviance:     268.1 on 293 degrees of freedom
Residual deviance: 259.4 on 292 degrees of freedom

The predicted probability of depression is calculated as: \[ P(depressed) = \frac{e^{-0.676 - 0.02096 \times age - 0.03656 \times income + 0.92945 \times gender}} {1 + e^{-0.676 - 0.02096 \times age - 0.03656 \times income + 0.92945 \times gender}} \]

Notice that you can’t get a “generic” probability for all individuals. The probability depends entirely on each individual’s covariate profile, that is, on the values \(X\) takes for that record.

Let’s compare the probability of being depressed for males and females separately, while holding age and income constant at their average values (44.4 years and $20.6k).

Plug the coefficient estimates and the values of the variables into the equation and calculate. \[ P(depressed \mid Female) = \frac{e^{-0.676 - 0.02096(44.4) - 0.03656(20.6) + 0.92945(1)}} {1 + e^{-0.676 - 0.02096(44.4) - 0.03656(20.6) + 0.92945(1)}} \]

\[ P(depressed \mid Male) = \frac{e^{-0.676 - 0.02096(44.4) - 0.03656(20.6) + 0.92945(0)}} {1 + e^{-0.676 - 0.02096(44.4) - 0.03656(20.6) + 0.92945(0)}} \]

A 44.4-year-old female with an annual income of $20.6k has a 0.19 probability of being depressed. The probability of depression for a male of equal age and income is 0.086.
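The plug-in arithmetic can be checked with a few lines of code. This is a minimal Python sketch, not the code used to fit the model. The coefficient values are copied from the equations above, gender is coded 1 for female and 0 for male as in those equations, and the function name `prob_depressed` is made up for illustration.

```python
import math

# Coefficient estimates as quoted in the equations above.
b0, b_age, b_income, b_gender = -0.676, -0.02096, -0.03656, 0.92945

# Age and income held constant at their average values.
age, income = 44.4, 20.6

def prob_depressed(gender):
    """Predicted probability of depression; gender is 1 for female, 0 for male."""
    eta = b0 + b_age * age + b_income * income + b_gender * gender
    return math.exp(eta) / (1 + math.exp(eta))

p_female = prob_depressed(1)
p_male = prob_depressed(0)

print(round(p_female, 3))  # 0.193
print(round(p_male, 3))    # 0.086
```

The two linear predictors differ only by the gender coefficient, so the gap between the two probabilities comes entirely from that one term.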