# Chapter 11 Generalized Linear Models

One of the primary assumptions of linear regression is that the error terms follow a specific distribution, namely:

$\epsilon_{i} \sim \mathcal{N}(0, \sigma^{2}) \qquad i=1, \ldots, n, \quad \mbox{and } \epsilon_{i} \perp \epsilon_{j}, i \neq j$

When your outcome variable $$y$$ is not continuous or not normally distributed, the above assumption fails dramatically.

Generalized Linear Models (GLMs) accommodate different types of outcome data by relating the linear portion of the model ($$\mathbf{X}\beta$$) to the outcome variable $$y$$ through a link function. They also allow the magnitude of the error variance ($$\sigma^{2}$$) to depend on the predicted values themselves.
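
In symbols, the idea can be sketched as follows. The notation $$\mu_{i}$$, $$g$$, $$V$$, and $$\phi$$ is introduced here for illustration and is not taken from the chapter itself:

$$
g(\mu_{i}) = \mathbf{x}_{i}^{\top}\beta, \qquad \mu_{i} = E(y_{i} \mid \mathbf{x}_{i}), \qquad \mathrm{Var}(y_{i} \mid \mathbf{x}_{i}) = \phi \, V(\mu_{i})
$$

Here $$g$$ is the link function, $$V$$ is a variance function, and $$\phi$$ is a dispersion parameter. For example, logistic regression uses the logit link $$g(\mu) = \log\{\mu / (1 - \mu)\}$$ with $$V(\mu) = \mu(1 - \mu)$$, and Poisson regression uses the log link $$g(\mu) = \log \mu$$ with $$V(\mu) = \mu$$. Ordinary linear regression is the special case with the identity link $$g(\mu) = \mu$$, $$V(\mu) = 1$$, and $$\phi = \sigma^{2}$$.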

There are a few overarching types of non-continuous outcomes that can be modeled with GLMs.

• Binary data: Logistic or Probit regression (11.3)
• Log-linear models (11.4)
• Multinomial/categorical data: Multinomial or Ordinal Logistic regression (11.6)
• Count data: Poisson regression (11.5)

This section uses functions from the following additional packages: gtsummary, MKmisc, survey.
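
As a rough illustration of how some of the model types listed above are requested in code, the sketch below assumes R (which the packages named in this chapter suggest) and uses only base R's `glm()`. The simulated data and all variable names (`x`, `y_bin`, `y_cnt`, `fit_logit`, and so on) are hypothetical and chosen for illustration only.

```r
# Simulated data; every name below is illustrative, not from the chapter.
set.seed(1)
n <- 200
x <- rnorm(n)

# Binary outcome: logistic and probit regression differ only in the link.
y_bin <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
fit_logit  <- glm(y_bin ~ x, family = binomial(link = "logit"))
fit_probit <- glm(y_bin ~ x, family = binomial(link = "probit"))

# Count outcome: Poisson regression with a log link.
y_cnt <- rpois(n, lambda = exp(0.3 + 0.5 * x))
fit_pois <- glm(y_cnt ~ x, family = poisson(link = "log"))

# Coefficients are reported on the scale of the link function.
coef(fit_logit)
coef(fit_probit)
coef(fit_pois)
```

The `family` argument selects both the outcome distribution and the link. Multinomial and ordinal models are not covered by `glm()` and need other tools, so they are left out of this sketch.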