12.6 ROC Curves
- ROC curves show the balance between sensitivity and specificity.
- We’ll use the [ROCR] package. It only takes 3 commands:
  - calculate `prediction()` using the model
  - calculate the model `performance()` on both the true positive rate and the false positive rate for a whole range of cutoff values
  - `plot` the curve
    - The `colorize` option colors the curve according to the probability cutoff point.
```r
library(ROCR)
# prediction object: fitted probabilities + observed outcomes
pr <- prediction(phat.depr, dep_sex_model$y)
# true positive rate (y) vs. false positive rate (x) at every cutoff
perf <- performance(pr, measure="tpr", x.measure="fpr")
plot(perf, colorize=TRUE, lwd=3, print.cutoffs.at=c(seq(0,1,by=0.1)))
abline(a=0, b=1, lty=2)  # reference line: no better than chance
```
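To make the link to sensitivity and specificity concrete, here is a minimal sketch of what one point on this curve represents, assuming a single cutoff of 0.5 and reusing `phat.depr` and `dep_sex_model$y` from above (the names `yhat`, `tpr`, and `fpr` are just for illustration):

```r
# classify at a single (assumed) cutoff of 0.5
yhat <- as.numeric(phat.depr > 0.5)
y <- dep_sex_model$y
# sensitivity: P(predict 1 | truly 1)
tpr <- sum(yhat == 1 & y == 1) / sum(y == 1)
# 1 - specificity: P(predict 1 | truly 0)
fpr <- sum(yhat == 1 & y == 0) / sum(y == 0)
c(tpr = tpr, fpr = fpr)  # one point on the ROC curve
```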
We can also use the `performance()` function to evaluate the \(f1\) measure and the overall accuracy across the same range of cutoff values:
```r
perf.f1  <- performance(pr, measure="f")    # F1 score at each cutoff
perf.acc <- performance(pr, measure="acc")  # accuracy at each cutoff

par(mfrow=c(1,2))
plot(perf.f1)
plot(perf.acc)
```
We can dig into the `perf.acc` object to get the maximum accuracy value (stored in `y.values`), then find the rows where that value occurs, and link them to the corresponding cutoff values in `x.values`.
```r
# maximum accuracy over all cutoffs
(max.f1 <- max(perf.acc@y.values[[1]], na.rm=TRUE))
## [1] 0.8333333
# rows where that maximum occurs
(row.with.max <- which(perf.acc@y.values[[1]]==max.f1))
## [1] 2 8
# the corresponding cutoff values
(cutoff.value <- perf.acc@x.values[[1]][row.with.max])
##       124       256
## 0.4508171 0.3946273
```
Sometimes (like here) there is not one single cutoff that maximizes accuracy. In this case I would look at which of these two cutoff points maximizes other metrics such as the \(f1\) score.
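For example, a minimal sketch of that tie-break, reusing the `perf.f1`, `row.with.max`, and `cutoff.value` objects created above (the names `f1.at.ties` and `best.cutoff` are just for illustration):

```r
# F1 at the two cutoffs that tie on accuracy; the rows of perf.f1 line up
# with perf.acc because both came from the same prediction object pr
(f1.at.ties <- perf.f1@y.values[[1]][row.with.max])
# keep the cutoff whose F1 is higher
(best.cutoff <- cutoff.value[which.max(f1.at.ties)])
```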
ROC curves:

- Can also be used for model comparison (see the sketch at the end of this section): http://yaojenkuo.io/diamondsROC.html
- The Area Under the Curve (AUC) also gives you a measure of overall model accuracy.
```r
auc <- performance(pr, measure='auc')
auc@y.values
## [[1]]
## [1] 0.695041
```
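As a sketch of the model-comparison idea, you could overlay the ROC curve from a second model and compare AUCs. Here `phat.depr2` is a hypothetical vector of predicted probabilities from a competing model fit to the same observations:

```r
# hypothetical second model: phat.depr2 holds its predicted probabilities
pr2   <- prediction(phat.depr2, dep_sex_model$y)
perf2 <- performance(pr2, measure="tpr", x.measure="fpr")
plot(perf, col="blue", lwd=3)            # first model (from above)
plot(perf2, col="red", lwd=3, add=TRUE)  # competing model
abline(a=0, b=1, lty=2)
legend("bottomright", c("model 1", "model 2"), col=c("blue", "red"), lwd=3)
# the model with the larger AUC discriminates better overall
performance(pr2, measure="auc")@y.values[[1]]
```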