By Marek Oerlemans
February 2025
Recently, I (re-)discovered the relationship between the C-index and the area under the receiver operating characteristic curve (AUC or AUROC for short). I’ve been working with binary classification data for quite some time, and calculating the AUC to measure model performance is standard practice. Not so long ago I started reading papers on survival analysis and came across the C-index. I had heard of it and knew its definition. In plain language, it’s the probability that the model score for a positive sample is greater than that of a negative sample (in a binary setting). This seems rather disconnected from the idea of an area under a curve made up of the true positive rate (TPR) on the y-axis and the false positive rate (FPR) on the x-axis.
This fact has been known for a very long time; there are published articles (here) and many Q&A and blog posts (e.g. here, here, here, and here) on the subject. Still, I had to wrap my head around how to work it out mathematically myself. In this post, I’ll walk through that thought process, and hopefully you’ll learn something from it. Along the way I’ll also explain the various mathematical and statistical concepts that I use.
To start proving that the two quantities are equivalent, we first have to define them mathematically. Assume that we have some model dataset with \(N\) samples: \(\{(x_i, y_i)\}_{i=1}^N\). We have built a prediction model \(f\) based on this set and define its output as \(\hat{y}_i = f(x_i)\). The output of the model could be in \([0,1]\) (typical for a binary classifier), but we’ll keep it general and assume \(\hat{y}_i \in (-\infty, \infty)\).
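To make this concrete, here is a minimal synthetic setup that the later snippets will reuse. Everything about it (the Gaussian score distributions, the sample sizes, the names `y` and `y_hat`) is my own choice for illustration, not something dictated by the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary dataset: labels y in {0, 1} and real-valued model scores y_hat.
# Negative samples get scores from N(0, 1), positive samples from N(1, 1).
n_neg, n_pos = 500, 500
y = np.concatenate([np.zeros(n_neg, dtype=int), np.ones(n_pos, dtype=int)])
y_hat = np.concatenate([rng.normal(0.0, 1.0, n_neg),
                        rng.normal(1.0, 1.0, n_pos)])
```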
As we discussed, the concordance index (C-index) is informally defined as the probability that the score of a positive sample is higher than the score of a negative sample. In mathematical notation:
\[\text{C-index} = \mathbb{P}(\hat{y}_i > \hat{y}_j \mid y_i > y_j) = \mathbb{P}(\hat{y}_i > \hat{y}_j \mid y_i = 1, y_j = 0).\]
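As a sanity check of this definition, the C-index can be estimated by brute force over all positive/negative pairs. Here is a minimal sketch on the toy data above; it is quadratic in the number of samples and counts ties as one half, which is the usual convention.

```python
def c_index(y, y_hat):
    """Fraction of (positive, negative) pairs where the positive sample scores
    higher; ties count as 1/2."""
    pos, neg = y_hat[y == 1], y_hat[y == 0]
    # Compare every positive score against every negative score.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

print(c_index(y, y_hat))  # roughly 0.76 for the Gaussian toy data
```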
To define the AUROC, we first recall:

True Positive Rate (TPR):
\(\text{TPR} = \frac{\text{TP}}{\text{TP} + \text{FN}} = \mathbb{P}(\hat{y}_i > t \,\mid\, y_i = 1).\)
False Positive Rate (FPR):
\(\text{FPR} = \frac{\text{FP}}{\text{TN} + \text{FP}} = \mathbb{P}(\hat{y}_i > t \,\mid\, y_i = 0).\)
Both depend on the chosen threshold \(t\). The ROC curve is a plot of the \((\text{FPR}, \text{TPR})\) pairs obtained at different thresholds \(t\), as seen in the figure below.
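As a small illustration, the sketch below computes the empirical (FPR, TPR) pairs on the toy data for a sweep of thresholds; using the sorted unique scores (plus \(\pm\infty\)) as thresholds is just one convenient choice.

```python
def roc_points(y, y_hat):
    """Empirical (FPR, TPR) pairs, one per threshold, from (0, 0) to (1, 1)."""
    # Thresholds from +inf down to -inf so the curve starts at (0, 0).
    thresholds = np.concatenate(([np.inf], np.unique(y_hat)[::-1], [-np.inf]))
    tpr = np.array([(y_hat[y == 1] > t).mean() for t in thresholds])
    fpr = np.array([(y_hat[y == 0] > t).mean() for t in thresholds])
    return fpr, tpr

fpr, tpr = roc_points(y, y_hat)
```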
The area under the ROC curve is abbreviated as AUC or AUROC and takes values in \([0,1]\): a perfect classifier achieves 1.0, random guessing yields 0.5, and a classifier that is systematically worse than random falls below 0.5. Formally:
\[\text{AUC} = \int_0^1 \text{TPR}(\text{FPR})\, d(\text{FPR}).\]
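Numerically, this area is usually approximated with the trapezoidal rule over the empirical ROC points; a minimal sketch using the `fpr` and `tpr` arrays computed above:

```python
# Trapezoidal rule over the empirical ROC points computed earlier.
auc_trapz = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)
print(auc_trapz)  # again roughly 0.76 for the toy data, matching the C-index estimate
```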
Now to show their equivalence. Starting from this definition of the AUC, we do a change of variable so that \(u = \text{FPR}(t)\). In more detail, we note:
Derivative of the FPR
\(\frac{d}{dt}\text{FPR}(t) = \frac{d}{dt}\mathbb{P}(\hat{y} > t \mid y=0) = -\,p(t \mid y=0),\)

where \(p(t \mid y=0)\) is the probability density function of the model score for negative samples. Note the minus sign: raising the threshold can only shrink the set of false positives, so the FPR is non-increasing in \(t\).
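To convince yourself of the minus sign, here is a tiny numerical check under the illustrative assumption that negative scores are standard normal, so that \(\text{FPR}(t) = 1 - \Phi(t)\): a central finite difference of the FPR should match \(-p(t \mid y=0) = -\varphi(t)\). It assumes SciPy is available.

```python
from scipy.stats import norm  # assumed available, used only for this check

def fpr_gaussian(t):
    # FPR(t) = P(y_hat > t | y = 0) for standard-normal negative scores.
    return 1.0 - norm.cdf(t)

t, h = 0.3, 1e-5
finite_diff = (fpr_gaussian(t + h) - fpr_gaussian(t - h)) / (2 * h)
print(finite_diff, -norm.pdf(t))  # both approximately -0.381
```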
Hence, substituting \(u = \text{FPR}(t)\) and noting that the FPR decreases from 1 to 0 as \(t\) runs from \(-\infty\) to \(\infty\), the minus sign from the derivative cancels against the reversed integration limits and we can rewrite:

\[\text{AUC} = \int_{-\infty}^{\infty} \text{TPR}(t)\, p(t \mid y=0)\, dt.\]

Recalling that \(\text{TPR}(t) = \mathbb{P}(\hat{y} > t \mid y=1)\), we get:
\[\text{AUC} = \int_{-\infty}^{\infty} \mathbb{P}(\hat{y} > t \mid y=1)\, p(t \mid y=0)\, dt = \int_{-\infty}^{\infty} \int_{t}^{\infty} p(s \mid y=1)\, ds\; p(t \mid y=0)\, dt = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \mathbb{1}(s > t)\, p(t \mid y=0)\, p(s \mid y=1)\, dt\, ds,\]

where we used the indicator function \(\mathbb{1}(\cdot)\), which equals one when its argument is true and zero otherwise.
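To see this double integral in action, we can evaluate it on a grid for assumed score densities. Below I take negatives ~ N(0, 1) and positives ~ N(1, 1) (my own illustrative choice, matching the toy data), for which the pairwise probability has the closed form \(\Phi(1/\sqrt{2}) \approx 0.76\), and check that a crude grid approximation of the integral lands on the same value.

```python
from scipy.stats import norm

# Grid approximation of the double integral of 1(s > t) p(t | y=0) p(s | y=1)
# with p(t | y=0) = N(0, 1) and p(s | y=1) = N(1, 1).
grid = np.linspace(-8.0, 9.0, 2001)
step = grid[1] - grid[0]
p_neg = norm.pdf(grid, loc=0.0, scale=1.0)   # density of negative scores (index t)
p_pos = norm.pdf(grid, loc=1.0, scale=1.0)   # density of positive scores (index s)
indicator = grid[None, :] > grid[:, None]    # indicator[i, j] = 1(s_j > t_i)

approx = (indicator * p_neg[:, None] * p_pos[None, :]).sum() * step * step
print(approx, norm.cdf(1 / np.sqrt(2)))      # both approximately 0.76
```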
This last double integral is exactly

\[\mathbb{P}(\hat{y}_i > \hat{y}_j \mid y_i = 1, y_j = 0),\]

which is the definition of the C-index. Thus:
\[\text{AUC} = \text{C-index}.\]

Although this exercise might be trivial for many, I found it useful to work out once. Hopefully you’ve learned something as well. The equivalence of the AUC and the C-index is not the only well-known connection here: there is also a link to the Mann–Whitney U test, a nonparametric test of whether values from one group tend to be greater than values from another group. We’ll leave figuring out this connection as an exercise for the reader. Hint: use the definition of the C-index and write the U statistic as a sum of an indicator function over the positive and negative samples.
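As a closing numerical sanity check, here is a sketch that puts the quantities from this post side by side on the toy data: scikit-learn’s AUROC, the brute-force C-index from earlier, and (anticipating the exercise) the Mann–Whitney U statistic rescaled by the number of positive/negative pairs. It assumes scikit-learn and SciPy are installed; recent SciPy versions return the U statistic for the first sample passed in.

```python
from scipy.stats import mannwhitneyu
from sklearn.metrics import roc_auc_score

pos, neg = y_hat[y == 1], y_hat[y == 0]

auc = roc_auc_score(y, y_hat)              # area under the ROC curve
cidx = c_index(y, y_hat)                   # pairwise C-index from the earlier sketch
u = mannwhitneyu(pos, neg).statistic       # Mann–Whitney U for the positive group
u_scaled = u / (len(pos) * len(neg))       # U divided by the number of pairs

print(auc, cidx, u_scaled)                 # all three approximately equal (~0.76)
```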
PS: I’m still getting the hang of writing maths in Markdown here on GitHub Pages. Sorry for any weird formatting.