7.2 Mathematical Model
The mathematical model that we use for regression has three features.
- Y values are normally distributed at any given X.
- The mean of Y values at any given X follows a straight line, $Y = \beta_0 + \beta_1 X$.
- The variance of Y values at any X is $\sigma^2$ (the same for all X). This is known as homoscedasticity, or homogeneity of variance.
Mathematically this is written as:
$$
\begin{aligned}
Y|X &\sim N(\mu_{Y|X}, \sigma^2) \\
\mu_{Y|X} &= \beta_0 + \beta_1 X \\
Var(Y|X) &= \sigma^2
\end{aligned}
$$
and can be visualized as:

Figure 6.2
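The three features of the model can also be checked by simulation. The following is a minimal sketch using only the Python standard library. The parameter values (`BETA0`, `BETA1`, `SIGMA`), the helper `simulate_y`, and the chosen X values are hypothetical and used only for illustration. It draws many Y values at each of a few fixed X values and compares their sample mean and standard deviation with what the model implies.

```python
import random
import statistics

# Hypothetical parameter values, chosen only for illustration.
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0


def simulate_y(x, n, rng):
    """Draw n values of Y at a fixed x from N(BETA0 + BETA1 * x, SIGMA^2)."""
    mean_at_x = BETA0 + BETA1 * x
    return [rng.gauss(mean_at_x, SIGMA) for _ in range(n)]


rng = random.Random(42)  # fixed seed so the run is repeatable
summary = {}
for x in (0, 5, 10):
    ys = simulate_y(x, 10_000, rng)
    # Store (sample mean, sample standard deviation) of Y at this x.
    summary[x] = (statistics.mean(ys), statistics.stdev(ys))
    line_value = BETA0 + BETA1 * x
    print(
        f"x={x:>2}: mean(Y)={summary[x][0]:.2f} "
        f"(line gives {line_value:.2f}), sd(Y)={summary[x][1]:.2f}"
    )
```

If the model holds, the sample mean at each X should sit close to the value on the line, and the sample standard deviation should be close to `SIGMA` at every X, which is the homoscedasticity feature.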
7.2.1 Unifying model framework
The mathematical model above describes the theoretical relationship between Y and X. To describe observed data, we place this model in our unifying model framework:
DATA = MODEL + RESIDUAL
Our observed data values $y_i$ can be modeled as being centered on $\mu_{Y|X}$, with normally distributed residuals.
$$
\begin{aligned}
y_i &= \beta_0 + \beta_1 x_i + \epsilon_i \\
\epsilon_i &\sim N(0, \sigma^2)
\end{aligned}
$$
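The DATA = MODEL + RESIDUAL decomposition can be illustrated with simulated data. This is a sketch under stated assumptions: the parameter values, the sample size, and the range of the x values are hypothetical, and the true parameters are treated as known (no estimation is done here). Each observation is split into its model part and its residual part, and the residuals are then summarized.

```python
import random
import statistics

# Hypothetical true parameter values, assumed known for this illustration.
beta0, beta1, sigma = 2.0, 0.5, 1.0

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Simulate observed data: each y_i is centered on the line, plus normal noise.
xs = [rng.uniform(0, 10) for _ in range(n)]
ys = [beta0 + beta1 * x + rng.gauss(0, sigma) for x in xs]

# MODEL: the mean of Y at each observed x.
model = [beta0 + beta1 * x for x in xs]

# RESIDUAL: what is left of each observation after removing the model part.
residual = [y - m for y, m in zip(ys, model)]

print(f"mean of residuals: {statistics.mean(residual):.3f}")
print(f"sd of residuals:   {statistics.stdev(residual):.3f}")
```

By construction, every observation equals its model part plus its residual. The residuals should average close to 0 with a standard deviation close to `sigma`, matching the assumed distribution of the residual term.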