9.8 General Advice

  • Model selection is not a hard science.
  • Some criteria have “rules of thumb” that can guide your exploration (e.g., a difference in AIC of less than 2 suggests the models are roughly comparable)
  • Use common sense: a sub-optimal subset may make more sense than the optimal one
  • p-values: When you compare two nested models, the difference between them often has a known distribution.
    • In the Wald (partial) F-test, the scaled difference in RSS between two nested models follows an F distribution.
  • All criteria should be used as guides.
  • Perform multiple methods of variable selection and look for commonalities among the selected models.
  • Let science and the purpose of your model be your ultimate guide
    • If the purpose of the model is explanation/interpretation, err on the side of parsimony (a smaller model) rather than one that is overly complex.
    • If the purpose is prediction, then as long as you’re not overfitting the model (as checked using cross-validation techniques), use as much information as possible.
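The two rules of thumb above (the partial F-test for nested models and the ΔAIC < 2 guideline) can be sketched in pure Python. The data below are synthetic and purely illustrative, and the AIC here is the usual Gaussian-likelihood form up to an additive constant.

```python
import math

# Synthetic, roughly linear data (illustrative values only)
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 4.3, 5.9, 8.2, 9.8, 12.1, 14.2, 15.9]
n = len(x)

# Model 0 (smaller): intercept only; fitted value is the mean of y
ybar = sum(y) / n
rss0 = sum((yi - ybar) ** 2 for yi in y)

# Model 1 (larger): simple linear regression via closed-form OLS
xbar = sum(x) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
rss1 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Partial F statistic for nested models:
# F = ((RSS0 - RSS1) / q) / (RSS1 / (n - p)),
# where q = extra parameters, p = parameters in the larger model.
q, p = 1, 2
f_stat = ((rss0 - rss1) / q) / (rss1 / (n - p))

# AIC up to an additive constant: n*log(RSS/n) + 2k
def aic(rss, k):
    return n * math.log(rss / n) + 2 * k

delta_aic = aic(rss0, 1) - aic(rss1, 2)
print(f"F = {f_stat:.1f}, dAIC = {delta_aic:.1f}")
```

Here ΔAIC is far above 2 and F is large, so both criteria agree that the slope term earns its keep; when ΔAIC falls below 2, the rule of thumb favors the smaller model.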
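The overfitting check via cross-validation mentioned above can be sketched as follows; the fold scheme, helper names, and data are illustrative assumptions, not a prescribed implementation.

```python
# K-fold cross-validation comparing an intercept-only model
# against a simple linear model on synthetic data.
x = list(range(1, 13))
y = [1.9, 4.2, 6.1, 7.8, 10.2, 11.9, 14.1, 16.2, 17.8, 20.1, 22.0, 23.8]

def fit_mean(xs, ys):
    """Intercept-only model: predict the training mean everywhere."""
    m = sum(ys) / len(ys)
    return lambda x_new: m

def fit_line(xs, ys):
    """Simple linear regression via closed-form OLS."""
    n = len(xs)
    xb, yb = sum(xs) / n, sum(ys) / n
    b1 = sum((a - xb) * (b - yb) for a, b in zip(xs, ys)) \
         / sum((a - xb) ** 2 for a in xs)
    b0 = yb - b1 * xb
    return lambda x_new: b0 + b1 * x_new

def cv_mse(fit, xs, ys, k=4):
    """Mean squared prediction error over k contiguous held-out folds."""
    n = len(xs)
    fold = n // k
    errs = []
    for i in range(k):
        hold = set(range(i * fold, (i + 1) * fold))
        tr_x = [xs[j] for j in range(n) if j not in hold]
        tr_y = [ys[j] for j in range(n) if j not in hold]
        model = fit(tr_x, tr_y)
        errs.extend((ys[j] - model(xs[j])) ** 2 for j in hold)
    return sum(errs) / len(errs)

mse_mean = cv_mse(fit_mean, x, y)
mse_line = cv_mse(fit_line, x, y)
print(f"CV MSE, intercept-only: {mse_mean:.2f}; linear: {mse_line:.2f}")
```

The larger model wins here because its extra information improves held-out prediction; if adding a term left the cross-validated error flat or worse, that would be the overfitting signal the bullet warns about.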