9.2 Parameter Estimation
Recall that the goal of regression analysis is to minimize the unexplained (residual) error, that is, the difference between the true value of the dependent variable and the value predicted by the model:
\[ y_{i} - \hat{y}_{i}, \]
where the predicted values \(\hat{y}_{i}\) are calculated as
\[\hat{y}_{i} = \sum_{j=1}^{p}X_{ij}\beta_{j}.\]
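As a minimal sketch of this calculation in Python, each fitted value is the sum, over the columns of \(X\), of an entry in row \(i\) multiplied by the matching coefficient. The data, the coefficient values, and the helper name `predict` are made up for illustration and do not come from the text; the first column of `X` is assumed to be all ones so that the first coefficient acts as an intercept.

```python
# Hypothetical data: n = 3 observations, p = 2 columns.
# Assumption: the first column of X is all ones, so beta[0] acts as an intercept.
X = [[1.0, 2.0],
     [1.0, 4.0],
     [1.0, 6.0]]
beta = [0.5, 2.0]  # made-up coefficients, not estimates


def predict(X, beta):
    """Return the fitted values: for each row i, the sum over j of X[i][j] * beta[j]."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]


y_hat = predict(X, beta)
print(y_hat)  # [4.5, 8.5, 12.5]
```

Note that the sum runs over the \(p\) columns (index \(j\)) for a fixed observation \(i\), so there is one fitted value per row of `X`.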
The sum of the squared residual errors (the squared distances between each observed point \(y_{i}\) and its fitted value) then has the following form:
\[ \sum_{i=1}^{n} \left| y_{i} - \sum_{j=1}^{p}X_{ij}\beta_{j} \right|^{2}\]
Or, in matrix notation,
\[ \lVert \mathbf{Y} - \mathbf{X}\boldsymbol{\beta} \rVert^{2}. \]
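To make the quantity being minimized concrete, the following Python sketch evaluates the sum of squared residuals for two candidate coefficient vectors. The data, the candidate values, and the function name `sum_squared_residuals` are hypothetical, chosen only for illustration, and the first column of `X` is again assumed to be all ones. The sketch only evaluates the criterion; it does not solve for the minimizing coefficients.

```python
# Hypothetical data: n = 4 observations, p = 2 columns.
# Assumption: the first column of X is all ones, so beta[0] acts as an intercept.
X = [[1.0, 1.0],
     [1.0, 2.0],
     [1.0, 3.0],
     [1.0, 4.0]]
y = [3.0, 5.5, 6.5, 9.0]  # made-up observed values


def sum_squared_residuals(X, y, beta):
    """Return the sum over i of (y_i - sum over j of X[i][j] * beta[j]) squared."""
    total = 0.0
    for row, y_i in zip(X, y):
        y_hat_i = sum(x_ij * b_j for x_ij, b_j in zip(row, beta))
        total += (y_i - y_hat_i) ** 2
    return total


# Two made-up candidate coefficient vectors.
rss_a = sum_squared_residuals(X, y, [0.0, 2.5])
rss_b = sum_squared_residuals(X, y, [1.0, 2.0])
print(rss_a)  # 2.5
print(rss_b)  # 0.5
```

Under the least squares criterion the second candidate fits these data better than the first, because its sum of squared residuals is smaller. Least squares estimation searches over all possible coefficient vectors for the one that makes this value as small as possible.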
Solving this least squares problem for multiple regression requires knowledge of multivariable calculus and linear algebra, and so is left to a course in mathematical statistics.