7.5 Correlation Coefficient

  • The correlation coefficient \(\rho\) measures the strength of the linear association between \(X\) and \(Y\) in the population.
  • \(\sigma^{2} = VAR(Y|X)\) is the conditional variance of \(Y\) at a specific value of \(X\).
  • \(\sigma_{y}^{2} = VAR(Y)\) is the overall (marginal) variance of \(Y\), taken across all values of \(X\).

\[ \sigma^{2} = \sigma_{y}^{2}(1-\rho^{2}) \]
\[ \rho^{2} = \frac{\sigma_{y}^{2} - \sigma^{2}}{\sigma_{y}^{2}} \]
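
As a quick numeric check of these two identities, take the illustrative values \(\sigma_{y}^{2} = 25\) and \(\rho = 0.8\) (chosen here for convenience, not taken from any data set):

\[ \sigma^{2} = 25(1 - 0.8^{2}) = 25(0.36) = 9 \]
\[ \rho^{2} = \frac{25 - 9}{25} = 0.64 \]

Under these assumed values, knowing \(X\) lowers the variance of \(Y\) from 25 to 9, a reduction of 64% of the original variance.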

  • \(\rho^{2}\) = (reduction in the variance of \(Y\) associated with knowledge of \(X\)) / (original variance of \(Y\))
  • Coefficient of determination: \(100\rho^{2}\) = percentage of the variance of \(Y\) associated with, or explained by, \(X\)
  • Caution: association is not causation. A large \(\rho^{2}\) shows that \(Y\) varies with \(X\), not that \(X\) causes \(Y\).
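
The variance-reduction reading of \(\rho^{2}\) can be checked on simulated data. The sketch below is only an illustration under stated assumptions: it uses NumPy, a hypothetical helper named `variance_explained`, and a made-up linear model \(Y = 2 + 1.5X + \varepsilon\) with \(VAR(X) = 1\) and \(VAR(\varepsilon) = 4\), for which the population value is \(\rho^{2} = 2.25/6.25 = 0.36\). It works with sample estimates (the sample correlation \(r\) and the residual variance from a least-squares line) in place of the population quantities.

```python
import numpy as np


def variance_explained(x, y):
    """Return (r, r_squared, residual_var, total_var) for a straight-line fit of y on x.

    Hypothetical helper for illustration; variances use the 1/n divisor.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Sample correlation coefficient r, the estimate of rho.
    r = np.corrcoef(x, y)[0, 1]

    # Least-squares line; np.polyfit returns the highest-degree coefficient first.
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)

    total_var = np.mean((y - y.mean()) ** 2)   # estimate of sigma_y^2
    residual_var = np.mean(residuals ** 2)     # estimate of sigma^2 = VAR(Y|X)
    return r, r ** 2, residual_var, total_var


# Simulated data from the assumed model Y = 2 + 1.5 X + noise.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=5000)
y = 2.0 + 1.5 * x + rng.normal(loc=0.0, scale=2.0, size=5000)

r, r2, resid_var, total_var = variance_explained(x, y)

print(f"r                      = {r:.3f}")
print(f"r^2                    = {r2:.3f}")
print(f"(total - resid)/total  = {(total_var - resid_var) / total_var:.3f}")
print(f"percent explained      = {100 * r2:.1f}%")
```

For a least-squares line with an intercept, the sample version of the identity holds exactly, so the second and third printed values should agree up to rounding, and both should lie near the assumed population value of 0.36. Nothing in this computation says anything about whether \(X\) causes \(Y\).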