ROC Curve & AUC Explained with Python Examples

Last updated: 29th Dec, 2023

Confusion among data scientists regarding the ROC Curve and AUC often stems from misunderstanding their relationship. The ROC Curve visualizes the true positive rate vs the false positive rate at various thresholds, while AUC quantifies the overall ability of a model to discriminate between classes, with higher values indicating better performance. In this post, you will learn about the ROC Curve and AUC along with related concepts such as true positive rate and false positive rate, with the help of Python examples. Learning ROC, AUC and related concepts is important because it helps in selecting the most appropriate machine learning classification model based on model performance.

What is ROC Curve & AUC / AUROC?

The receiver operating characteristic (ROC) curve is used for selecting the most appropriate classification model based on its performance with respect to the true positive rate (TPR), also known as recall or sensitivity, and the false positive rate (FPR), the ratio of negative instances that are incorrectly classified as positive. The false positive rate can also be expressed as (1 – specificity). These metrics are computed by shifting the decision threshold of the classifier. The ROC curve is used for probabilistic models which predict the probabilities of the classes. Here is a great paper to read and learn about the ROC curve and AUC: A Relationship between the Average Precision and the Area Under the ROC Curve by Su, W., Yuan, Y., and Zhu, M.

Let’s look at a sample ROC curve given below:

Fig 1. ROC Curve (Image credit: Wikimedia)

In the above ROC curve diagram, pay attention to some of the following:

  • Different ROC curves – Different models: There are different curves (red, green, blue) pertaining to different models. These models can differ, for example, because they were trained using different hyperparameters.
  • Diagonal line – Random Guessing: The dashed red line drawn diagonally represents the random guessing (random classifier).
  • Better models: A model whose ROC curve lies above the red dashed line performs better than random guessing and could be considered for adoption.
  • Worse models: A model whose ROC curve lies below the red dashed line performs worse than random guessing and should be rejected.
  • Points on ROC Curve: The points on each ROC curve represent the decision thresholds and the corresponding values of true positive rate and false positive rate. Different thresholds result in different values of TPR and FPR, and accordingly in different overall model performance (see the short sketch after this list).
  • Ideal decision threshold: The ideal decision threshold is one which results in a very high value of TPR (close to 1) and a very low value of FPR (close to 0).
  • AUROC: AUROC stands for Area Under the ROC Curve. The area under the ROC curve is computed to characterise the performance of a classification model. The higher the AUC or AUROC, the better the model is at predicting 0s as 0s and 1s as 1s.
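
To see how thresholds map to points on the ROC curve, here is a minimal sketch using sklearn's roc_curve on a tiny made-up example (the label and probability arrays below are toy values for illustration only):

import numpy as np
from sklearn.metrics import roc_curve
#
# Toy labels and predicted probabilities (made-up values for illustration only)
#
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.55])
#
# Each decision threshold yields one (FPR, TPR) point on the ROC curve
#
fpr, tpr, thresholds = roc_curve(y_true, y_score, pos_label=1)
for f, t, th in zip(fpr, tpr, thresholds):
    print("threshold=%.2f  FPR=%.2f  TPR=%.2f" % (th, f, t))

Sweeping the threshold from high to low moves the operating point from the bottom left (nothing predicted positive) towards the top right (everything predicted positive), tracing out the ROC curve.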

Different models produce different ROC curves and AUC values. As a rough guide, a ROC-AUC between 0.9 and 1.0 is considered very good.

Let’s understand why the ideal decision threshold is one with TPR close to 1 and FPR close to 0.

True Positive Rate (TPR) = True Positive (TP) / (TP + FN) = TP / Positives

False Positive Rate (FPR) = False Positive (FP) / (FP + TN) = FP / Negatives

A higher value of TPR means that the number of false negatives is very low, which means almost all positives are predicted correctly.

A lower value of FPR means that the number of false positives is very low, which means almost all negatives are predicted correctly.
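
To make the two formulas concrete, here is a small worked example with made-up confusion matrix counts (the numbers are illustrative only):

#
# Hypothetical confusion matrix counts (illustrative only)
#
TP, FN = 90, 10   # 100 actual positives
FP, TN = 5, 95    # 100 actual negatives

tpr = TP / (TP + FN)   # 90 / 100 = 0.90 -> almost all positives predicted correctly
fpr = FP / (FP + TN)   # 5 / 100 = 0.05 -> very few negatives misclassified as positive
print(tpr, fpr)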

Going by the above, a decision threshold near the top left of the ROC curve results in optimal model performance. In the diagram above, this point is labeled “Perfect Classifier”.

ROC Curve and AUC have been used as evaluation metrics for classification models in several Kaggle competitions.

ROC & AUC Explained with Python Examples

In this section, you will learn to use the roc_curve and auc methods of sklearn.metrics. The Sklearn breast cancer dataset is used to illustrate the ROC curve and AUC. Pay attention to some of the following in the code given below.

  • Method roc_curve is used to obtain the true positive rates and false positive rates at different decision thresholds. Method roc_curve is passed the test labels, the predicted probabilities of the positive class, and the label of the positive class (pos_label).
  • Method auc is used to obtain the area under the ROC curve. Method auc is passed the false positive rates and true positive rates.
  • Three different ROC curves are drawn using different sets of features.
  • Method predict_proba is used to get the predicted probabilities of all the classes.
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt
from sklearn.pipeline import make_pipeline
#
# Load the breast cancer data set
#
bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target
#
# Create training and test split
#
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=1, stratify=y)
#
# Create the estimator - pipeline
#
pipeline = make_pipeline(StandardScaler(), LogisticRegression(random_state=1))
#
# Fit the pipeline using two of the features and compute FPR, TPR and AUC
#
pipeline.fit(X_train[:,[2, 13]],y_train)
probs = pipeline.predict_proba(X_test[:,[2, 13]])
fpr1, tpr1, thresholds = roc_curve(y_test, probs[:, 1], pos_label=1)
roc_auc1 = auc(fpr1, tpr1)
#
# Fit the pipeline using two different features and compute FPR, TPR and AUC
#
pipeline.fit(X_train[:,[4, 14]],y_train)
probs2 = pipeline.predict_proba(X_test[:,[4, 14]])
fpr2, tpr2, thresholds = roc_curve(y_test, probs2[:, 1], pos_label=1)
roc_auc2 = auc(fpr2, tpr2)
#
# Fit the pipeline using all features and compute FPR, TPR and AUC
#
pipeline.fit(X_train,y_train)
probs3 = pipeline.predict_proba(X_test)
fpr3, tpr3, thresholds = roc_curve(y_test, probs3[:, 1], pos_label=1)
roc_auc3 = auc(fpr3, tpr3)

fig, ax = plt.subplots(figsize=(7.5, 7.5))

plt.plot(fpr1, tpr1, label='ROC Curve 1 (AUC = %0.2f)' % (roc_auc1))
plt.plot(fpr2, tpr2, label='ROC Curve 2 (AUC = %0.2f)' % (roc_auc2))
plt.plot(fpr3, tpr3, label='ROC Curve 3 (AUC = %0.2f)' % (roc_auc3))
plt.plot([0, 1], [0, 1], linestyle='--', color='red', label='Random Classifier')    
plt.plot([0, 0, 1], [0, 1, 1], linestyle=':', color='green', label='Perfect Classifier')
plt.xlim([-0.05, 1.05])
plt.ylim([-0.05, 1.05])
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.legend(loc="lower right")
plt.show()

Here is how the ROC curve plot looks. Pay attention to some of the following in the plot:

  • The red dashed line represents random guessing.
  • The green dotted line towards the top left represents the best / perfect classifier.
  • The other classifiers have different AUC values and related ROC curves.
Fig 2. ROC Curve Plot
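
As a side note, when only the AUC value is needed (without the plot), sklearn.metrics.roc_auc_score computes it directly from the labels and the predicted probabilities of the positive class. A minimal sketch, reusing y_test and probs3 from the code above:

from sklearn.metrics import roc_auc_score
#
# Equivalent to calling roc_curve followed by auc for the all-features model
#
print(roc_auc_score(y_test, probs3[:, 1]))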

Conclusions

Here is what you learned in this post in relation to ROC curve and AUC:

  • ROC curve is used for probabilistic models which predict the probabilities of one or more classes.
  • ROC curve is used to select the most appropriate model based on model performance.
  • ROC curve is a plot of true positive rate vs false positive rate values, determined at different decision thresholds for a particular model.
  • Different ROC curves can be created based on different features, model hyperparameters, etc.
  • AUC or AUROC is the area under the ROC curve. The value of AUC characterizes the model performance: the higher the AUC value, the better the performance of the model.
  • The perfect classifier has a very high value of true positive rate and a very low value of false positive rate.
  • Any model whose ROC curve lies above the random guessing line can be considered a better model.
  • Any model whose ROC curve lies below the random guessing line can be rejected outright.