12.1 Predicted Probabilities

  • Sometimes Odds Ratios can be difficult to interpret or understand.
  • Sometimes you just want to report the probability of the event occurring.
  • Or sometimes you want to predict whether or not a new individual is going to have the event.

For all of these, we need to calculate \(p_{i} = P(y_{i}=1)\), the probability of the event. Solving the logistic model \(\log\left(p_{i} / (1-p_{i})\right) = \beta X\) for \(p_{i}\) gives \(p_{i} = e^{\beta X} / (1+e^{\beta X})\), the probability of an event.
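
As a short sketch of the algebra behind that back-solving step (using the same \(\beta X\) shorthand for the linear predictor), start from the model on the log-odds scale and isolate \(p_{i}\):

\[ \begin{aligned} \log\left(\frac{p_{i}}{1-p_{i}}\right) &= \beta X \\ \frac{p_{i}}{1-p_{i}} &= e^{\beta X} \\ p_{i} &= (1-p_{i})\,e^{\beta X} \\ p_{i}\left(1 + e^{\beta X}\right) &= e^{\beta X} \\ p_{i} &= \frac{e^{\beta X}}{1+e^{\beta X}} \end{aligned} \]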

\[ p_{i} = \frac{e^{\beta_{0} + \beta_{1}x_{1i} + \beta_{2}x_{2i} + \ldots + \beta_{p}x_{pi}}} {1 + e^{\beta_{0} + \beta_{1}x_{1i} + \beta_{2}x_{2i} + \ldots + \beta_{p}x_{pi}}} \]
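
This transformation from the linear predictor to a probability is the inverse logit. The following is a minimal sketch in base R; the helper name `inv_logit` is hypothetical and not part of the original text. It shows that the hand-written formula agrees with the built-in `plogis()` function from the stats package.

```r
# Hypothetical helper: inverse logit, maps a linear predictor to a probability
inv_logit <- function(xb) {
  exp(xb) / (1 + exp(xb))
}

# A few example linear predictor values (made up for illustration)
xb <- c(-2, 0, 1.5)

p_manual  <- inv_logit(xb)
p_builtin <- plogis(xb)   # logistic distribution function; same transformation

all.equal(p_manual, p_builtin)   # TRUE
```

A linear predictor of 0 corresponds to a probability of 0.5; negative values give probabilities below 0.5 and positive values give probabilities above it.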

Consider the main effects model of depression on age, income and sex from section 11.4.3.

              Estimate   Std. Error   z value   Pr(>|z|)
(Intercept)   -0.6765    0.5788       -1.169    0.2425
age           -0.02096   0.00904      -2.318    0.02043
income        -0.03656   0.01409      -2.595    0.009457
sexFemale      0.9294    0.3858        2.409    0.016

(Dispersion parameter for binomial family taken to be 1 )

Null deviance: 268.1 on 293 degrees of freedom
Residual deviance: 247.5 on 290 degrees of freedom

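The predicted probability of depression is calculated as follows, where \(sex = 1\) for females and \(sex = 0\) for males: \[ P(depressed) = \frac{e^{-0.676 - 0.02096 \times age - 0.03656 \times income + 0.92945 \times sex}} {1 + e^{-0.676 - 0.02096 \times age - 0.03656 \times income + 0.92945 \times sex}} \]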

Notice that this formulation requires you to specify a covariate profile, that is, the value that each X takes on for the record in question. Often, when you are only concerned with the effect of a single measure, you set all other measures equal to their means.

Let’s compare the probability of being depressed for males and females separately, while holding age and income constant at the average value calculated across all individuals (regardless of sex).

library(dplyr)  # provides %>% and summarize()
depress %>% summarize(age=mean(age), income=mean(income))
##        age   income
## 1 44.41497 20.57483

Plug the coefficient estimates and the values of the variables into the equation and calculate. \[ P(depressed|Female) = \frac{e^{-0.676 - 0.02096(44.4) - 0.03656(20.6) + 0.92945(1)}} {1 + e^{-0.676 - 0.02096(44.4) - 0.03656(20.6) + 0.92945(1)}} \]

XB.f <- -0.676 - 0.02096*(44.4) - .03656*(20.6) + 0.92945
exp(XB.f) / (1+exp(XB.f))
## [1] 0.1930504

\[ P(depressed|Male) = \frac{e^{-0.676 - 0.02096(44.4) - 0.03656(20.6) + 0.92945(0)}} {1 + e^{-0.676 - 0.02096(44.4) - 0.03656(20.6) + 0.92945(0)}} \]

XB.m <- -0.676 - 0.02096*(44.4) - .03656*(20.6)
exp(XB.m) / (1+exp(XB.m))
## [1] 0.08629312

A 44.4 year old female with an annual income of $20.6k has a 0.19 probability of being depressed. The probability of depression for a male of equal age and income is 0.086.
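
The two hand calculations can also be done in one step by stacking the covariate profiles as rows of a matrix. This is only a sketch: the object names `b`, `X`, `xb`, `p` and `odds` are illustrative, and the coefficients are typed in by hand at the same precision used in the equations above rather than pulled from a fitted model.

```r
# Coefficients as used in the worked equations above (typed in by hand)
b <- c(intercept = -0.676, age = -0.02096, income = -0.03656, sexFemale = 0.92945)

# One row per covariate profile; the leading 1 multiplies the intercept
X <- rbind(
  female = c(1, 44.4, 20.6, 1),
  male   = c(1, 44.4, 20.6, 0)
)

xb <- drop(X %*% b)   # linear predictor for each profile
p  <- plogis(xb)      # inverse logit: predicted probabilities
round(p, 3)

# Odds for each profile, and their ratio
odds <- p / (1 - p)
unname(odds["female"] / odds["male"])   # equals exp(0.92945), about 2.53
```

Because the two profiles differ only in sex, the ratio of their odds is the exponentiated sex coefficient, which ties the predicted probabilities back to the odds ratio. If the fitted model object is still in the workspace, `predict()` with a `newdata` data frame and `type = "response"` should return comparable probabilities based on the unrounded coefficients.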