7.3 Parameter Estimates
- Estimate the slope \(\beta_{1}\) and intercept \(\beta_{0}\) using a method called Least Squares.
- The residual mean squared error (RMSE) provides \(s^{2}\), an estimate of the error variance \(\sigma^{2}\).
- RMSE can also refer to the root mean squared error, \(S\), which estimates \(\sigma\).
The Least Squares method finds the estimates \(b_{0}\) and \(b_{1}\) of the intercept \(\beta_{0}\) and slope \(\beta_{1}\) that minimize the SSE (sum of squared errors). Let’s explore that visually:
See https://paternogbc.shinyapps.io/SS_regression/
Initial Setup
- Set the sample size to 50
- Set the regression slope to 1
- Set the standard deviation to 5
The method of Least Squares finds the estimates of \(\beta_{0}\) and \(\beta_{1}\) that minimize the sum of the squared residuals:
\[ \sum_{i=1}^{n} \epsilon_{i}^{2} = \sum_{i=1}^{n} \left( y_{i} - \beta_{0} - \beta_{1}x_{i} \right)^{2} \]
For simple linear regression, the regression coefficient estimates that minimize the sum of squared errors can be calculated as:
\[ \hat{\beta_{0}} = \bar{y} - \hat{\beta_{1}}\bar{x} \quad \mbox{ and } \quad \hat{\beta_{1}} = r\frac{s_{y}}{s_{x}} \]
where \(r\) is the sample correlation between \(x\) and \(y\), and \(s_{x}\) and \(s_{y}\) are the sample standard deviations of \(x\) and \(y\).
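Here is a minimal sketch in Python (using simulated data that roughly mimics the app settings above; the data and variable names are hypothetical) that computes the closed-form estimates and checks them against a built-in least squares fit:

```python
import numpy as np

# Simulate data similar to the app setup: n = 50, slope 1, sd 5 (hypothetical)
rng = np.random.default_rng(42)
x = rng.normal(size=50)
y = 1.0 * x + rng.normal(scale=5, size=50)

r = np.corrcoef(x, y)[0, 1]                # sample correlation
b1 = r * y.std(ddof=1) / x.std(ddof=1)     # slope: r * s_y / s_x
b0 = y.mean() - b1 * x.mean()              # intercept: ybar - b1 * xbar

# Compare with numpy's built-in least squares line
b1_np, b0_np = np.polyfit(x, y, deg=1)
print(b0, b1)
print(b0_np, b1_np)   # should match the closed-form estimates
```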
7.3.1 Sum of Squares
Partitioning the variance using the sums of squares:
- SS Total: how far are the points from \(\bar{y}\) (the overall sample mean)?
- SS Regression: how far is the regression line from \(\bar{y}\)?
- SS Error: how far are the points from the estimated regression line?
Looking at it this way, we are asking “If I know the value of \(x\), how much better will I be at predicting \(y\) than if I were just to use \(\bar{y}\)?”
This is the same partitioning of variance that happens with ANOVA!
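As a quick check of this partition, here is a small sketch (again with simulated, hypothetical data) that computes SS Total, SS Regression, and SS Error and confirms that the first equals the sum of the other two:

```python
import numpy as np

# Hypothetical simulated data
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 1.0 * x + rng.normal(scale=5, size=50)

b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x

ss_total = np.sum((y - y.mean()) ** 2)      # points vs. ybar
ss_reg   = np.sum((y_hat - y.mean()) ** 2)  # fitted line vs. ybar
ss_error = np.sum((y - y_hat) ** 2)         # points vs. fitted line

print(ss_total, ss_reg + ss_error)          # should agree up to rounding
```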
Here is a link to another interactive app where you can try to fit your own line to minimize the SSE.
RMSE is the Root Mean Squared Error. In the PMA textbook this is denoted as \(S\), which is an estimate for \(\sigma\).
\[ S = \sqrt{\frac{SSE}{n-2}}\]
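For completeness, here is a short sketch (simulated, hypothetical data) computing \(S\) directly from the residuals; with a true error standard deviation of 5, the estimate should land near 5:

```python
import numpy as np

# Hypothetical simulated data with true sigma = 5
rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 1.0 * x + rng.normal(scale=5, size=50)

b1, b0 = np.polyfit(x, y, deg=1)
residuals = y - (b0 + b1 * x)
sse = np.sum(residuals ** 2)
n = len(y)

S = np.sqrt(sse / (n - 2))   # RMSE: estimate of sigma
print(S)
```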