# Chapter 10 Generalized Linear Models

One of the primary assumptions of linear regression is that the error terms follow a specific distribution, namely:

$\epsilon_{i} \sim \mathcal{N}(0, \sigma^{2}) \qquad i=1, \ldots, n, \quad \mbox{and } \epsilon_{i} \perp \epsilon_{j}, i \neq j$

When your outcome variable $$y$$ is not continuous, or is not normally distributed given the predictors (for example, a binary or count outcome), the above assumption no longer holds.
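
One way to see the problem is to fit an ordinary least-squares line to a binary outcome. The sketch below uses made-up toy data, and the helper name `fit_simple_ols` is an illustrative assumption rather than anything from these notes. The fitted line returns values below 0 and above 1, which cannot be read as probabilities.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical toy data: a binary outcome that switches from 0 to 1 as x grows.
x = list(range(10))
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

intercept, slope = fit_simple_ols(x, y)
fitted = [intercept + slope * xi for xi in x]

# The fitted values at the extremes fall outside the [0, 1] range.
print("fitted at x=0:", round(fitted[0], 3))
print("fitted at x=9:", round(fitted[-1], 3))
```

Note also that at any given fitted value the error can take only two values (one for $$y = 0$$ and one for $$y = 1$$), so it cannot be normally distributed.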

Generalized Linear Models (GLMs) accommodate different types of outcome data by relating the linear portion of the model ($$\mathbf{X}\beta$$) to the mean of the outcome variable $$y$$ through a link function. They also allow the variance of the errors ($$\sigma^{2}$$) to depend on the predicted values themselves, rather than being constant.
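
In symbols, this structure can be sketched as follows. The link function $$g$$, the mean $$\mu_{i}$$, the variance function $$V$$, and the dispersion parameter $$\phi$$ are notation introduced here for illustration and are not defined elsewhere in this section:

$g(\mu_{i}) = \mathbf{x}_{i}^{\top}\beta, \qquad \mu_{i} = E(y_{i} \mid \mathbf{x}_{i}), \qquad \mbox{Var}(y_{i} \mid \mathbf{x}_{i}) = \phi \, V(\mu_{i})$

Ordinary linear regression is the special case with the identity link $$g(\mu) = \mu$$ and a constant variance function $$V(\mu) = 1$$, so that $$\phi = \sigma^{2}$$.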

There are a few overarching types of non-continuous outcomes that can be modeled with GLMs.

• Binary data: Logistic or Probit regression
• Log-linear models
• Multinomial/categorical data: Multinomial or Ordinal Logistic regression
• Count data: Poisson regression
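
As a rough illustration of how some of these models differ, the sketch below pairs three of them with their usual link function, inverse link, and variance function. The dictionary `FAMILIES` and the helper `predicted_mean` are hypothetical names chosen for this example. The multinomial and ordinal cases are left out because their links act on a vector of category probabilities.

```python
import math
from statistics import NormalDist

_std_normal = NormalDist()  # standard normal, used by the probit link

# Each entry: (link g, inverse link g^{-1}, variance function V(mu)).
FAMILIES = {
    "logistic": (
        lambda mu: math.log(mu / (1 - mu)),      # logit link
        lambda eta: 1 / (1 + math.exp(-eta)),    # inverse logit
        lambda mu: mu * (1 - mu),                # Bernoulli variance
    ),
    "probit": (
        _std_normal.inv_cdf,                     # probit link
        _std_normal.cdf,                         # standard normal CDF
        lambda mu: mu * (1 - mu),                # Bernoulli variance
    ),
    "poisson": (
        math.log,                                # log link
        math.exp,                                # inverse of the log link
        lambda mu: mu,                           # Poisson variance equals the mean
    ),
}

def predicted_mean(family, eta):
    """Map a linear predictor value eta = x'beta to the mean of y."""
    _, inverse_link, _ = FAMILIES[family]
    return inverse_link(eta)

for name in FAMILIES:
    mu = predicted_mean(name, 0.5)
    variance = FAMILIES[name][2](mu)
    print(f"{name}: mean={mu:.3f}, variance function={variance:.3f}")
```

In each case the same linear predictor is mapped to a mean that respects the range of the outcome (between 0 and 1 for binary data, positive for counts), and the variance changes with that mean.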