11.5 Model Performance

  • Say we decide that 0.15 is our optimal cutoff for predicting depression with this model.
  • We can use this cutoff to classify each row into a predicted group by comparing its predicted probability with 0.15.
    • The assigned class values must match the data type and levels of the true values.
    • The levels also have to be in the same order, so the 0 group needs to come first.
    • I want this matrix to show up like the one in Wikipedia, so I’m leveraging the forcats package to reverse my factor level ordering.
  • We can calculate a confusion matrix using the confusionMatrix() function from the caret package.
  • 123 people were correctly predicted to not be depressed (True Negative, \(n_{11}\))
  • 121 people were incorrectly predicted to be depressed (False Positive, \(n_{21}\))
  • 10 people were incorrectly predicted to not be depressed (False Negative, \(n_{12}\))
  • 40 people were correctly predicted to be depressed (True Positive, \(n_{22}\))
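
The classification and cross-tabulation steps above can be sketched in base R. This is only an illustration: the vectors truth and phat are hypothetical stand-ins built to reproduce the four counts reported above, not the actual depression data, and classifying probabilities strictly greater than the cutoff as 1 is an assumption.

```r
# Hypothetical stand-in data built to reproduce the counts in the text;
# a real analysis would use the model's predicted probabilities instead.
truth <- c(rep(0, 244), rep(1, 50))          # 244 not depressed, 50 depressed
phat  <- c(rep(0.05, 123), rep(0.30, 121),   # true 0: 123 below, 121 above the cutoff
           rep(0.05, 10),  rep(0.30, 40))    # true 1: 10 below, 40 above the cutoff

cutoff <- 0.15
# Assumption: probabilities strictly above the cutoff are classified as 1.
pred <- ifelse(phat > cutoff, 1, 0)

# Predicted and true classes share the same type and level order (0 first).
pred_f  <- factor(pred,  levels = c(0, 1))
truth_f <- factor(truth, levels = c(0, 1))

# Rows are predictions and columns are true values, so cell [2, 1] is n_21.
cm <- table(Prediction = pred_f, Truth = truth_f)
print(cm)
```

With caret installed, a call along the lines of confusionMatrix(data = pred_f, reference = truth_f, positive = "1") should report the same table together with the summary statistics discussed next.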

Other terminology:

  • Sensitivity/Recall/True positive rate: P(predicted positive | truly positive) = 40/(10+40) = .8
  • Specificity/True negative rate: P(predicted negative | truly negative) = 123/(123+121) = .504
  • Precision/Positive predictive value: P(truly positive | predicted positive) = 40/(121+40) = .2484
  • Accuracy: (TP + TN)/Total = (40 + 123)/(40+123+121+10) = .5544
  • Balanced Accuracy: \([(n_{11}/n_{.1}) + (n_{22}/n_{.2})]/2\) = (.504 + .8)/2 = .652. This is the average of specificity and sensitivity, which adjusts for class size imbalances (like in this example)
  • F1 score: the harmonic mean of precision and recall. This ranges from 0 (bad) to 1 (good): \(2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}}\) = 2*(.2484*.8)/(.2484+.8) = .38
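
As a check on the arithmetic, the measures listed above can be recomputed directly from the four cell counts. This is a minimal base R sketch; the variable names are illustrative and are not part of caret's output.

```r
# Cell counts taken from the confusion matrix in the text.
tn <- 123; fp <- 121; fn <- 10; tp <- 40

sensitivity <- tp / (tp + fn)                    # recall, true positive rate
specificity <- tn / (tn + fp)                    # true negative rate
precision   <- tp / (tp + fp)                    # positive predictive value
accuracy    <- (tp + tn) / (tp + tn + fp + fn)
balanced    <- (sensitivity + specificity) / 2   # balanced accuracy
f1          <- 2 * precision * sensitivity / (precision + sensitivity)

# Collect the results, rounded to three decimals for display.
metrics <- round(c(sensitivity = sensitivity, specificity = specificity,
                   precision = precision, accuracy = accuracy,
                   balanced_accuracy = balanced, f1 = f1), 3)
print(metrics)
```

Because only about 17% of the sample (50 of 294) is truly depressed, accuracy alone is dominated by the negative class here, which is why balanced accuracy and F1 are worth reporting alongside it.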