8.3 Parameter Estimation

The goal of regression analysis is to minimize the residual error, that is, the difference between the observed value of the dependent variable and the value predicted by the model:

\[ \epsilon_{i} = y_{i} - \hat{y}_{i}\]

The method of Least Squares accomplishes this by finding parameter estimates \(\hat{\beta}_{0}\) and \(\hat{\beta}_{1}\) that minimize the sum of the squared residuals:

\[ \sum_{i=1}^{n} \epsilon_{i}^{2} \]

For simple linear regression the regression coefficient estimates that minimize the sum of squared errors can be calculated as: \[ \hat{\beta_{0}} = \bar{y} - \hat{\beta_{1}}\bar{x} \quad \mbox{ and } \quad \hat{\beta_{1}} = r\frac{s_{y}}{s_{x}} \]
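These formulas translate directly into code. Below is a minimal sketch in Python (using NumPy) that computes \(\hat{\beta}_{0}\) and \(\hat{\beta}_{1}\) from the sample means, standard deviations, and correlation; the small dataset is hypothetical, included only for illustration.

```python
import numpy as np

# Hypothetical data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Sample statistics: means, standard deviations, and correlation r
x_bar, y_bar = x.mean(), y.mean()
s_x = x.std(ddof=1)  # ddof=1 gives the sample standard deviation
s_y = y.std(ddof=1)
r = np.corrcoef(x, y)[0, 1]

# Least Squares estimates from the formulas above
beta1_hat = r * s_y / s_x
beta0_hat = y_bar - beta1_hat * x_bar

print(f"beta0_hat = {beta0_hat:.4f}, beta1_hat = {beta1_hat:.4f}")
```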

For multiple linear regression, the fitted values \(\hat{y}_{i}\) are calculated as a linear combination of the x's and \(\beta\)'s, \(\sum_{j=1}^{p}X_{ij}\beta_{j}\). The sum of the squared residual errors (the squared distances between the observed points \(y_{i}\) and the fitted values) now has the following form:

\[ \sum_{i=1}^{n} \left( y_{i} - \sum_{j=1}^{p}X_{ij}\beta_{j} \right)^{2}\]

Or, in matrix notation:

\[ || \mathbf{y} - \mathbf{X}\boldsymbol{\beta} ||^{2} \]

The details of the methods used to calculate the Least Squares estimates of the \(\beta\)'s are left to a course in mathematical statistics.
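Even without the derivation, the minimizer of \(|| \mathbf{y} - \mathbf{X}\boldsymbol{\beta} ||^{2}\) can be computed numerically. The sketch below, again with hypothetical data, uses NumPy's least-squares solver; the well-known closed-form solution \((\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}\) is shown in a comment for comparison.

```python
import numpy as np

# Hypothetical design matrix X (first column of 1's for the intercept)
# and response vector y, for illustration only
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 4.0],
              [1.0, 4.0, 3.0],
              [1.0, 5.0, 6.0]])
y = np.array([3.0, 4.5, 9.0, 10.0, 14.5])

# Minimize ||y - X beta||^2 with NumPy's least-squares solver
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Equivalent closed-form solution via the normal equations:
# beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

print("beta_hat =", beta_hat)
```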