Last updated: 29th Nov, 2023
This page lists practice tests / interview questions and answers for logistic regression in machine learning. Those wanting to test their machine learning knowledge of logistic regression will find these practice tests useful for checking their understanding from time to time, and especially handy when preparing for interviews. Candidates going for fresher / intern interviews in machine learning should also find these practice tests / interview questions helpful.
These tests primarily focus on the following concepts related to logistic regression:
- Types of logistic regression (Binomial, Multinomial, Ordinal)
- Logistic function, logit transformation
- Evaluation of logistic regression (AIC, Deviance calculations)
- Classification problems examples where logistic regression can be applied
Logistic Regression Concepts
- Logistic Regression for Discrete Outcomes: Logistic regression is used to estimate/predict the discrete-valued output such as success or failure, 0 or 1, etc.
- Binary and Multinomial Classification: Logistic regression can be used for binary classification as well as multinomial classification – classifying data into multiple classes.
- Softmax Classifier: In logistic regression, the term “softmax classifier” specifically refers to the multinomial logistic regression, where the softmax function is used to handle multiple classes. You may want to check out my post on What’s Softmax function and why do we need it?
- Gradient Descent and Cross-Entropy Loss: Logistic regression classifier is trained by applying gradient descent on the cross-entropy loss function. In other words, the weights of the logistic regression classifier are learned using gradient descent algorithm and cross-entropy loss function. You may want to check my post on Cross-entropy loss explained with Python examples.
- Cost Function and Log Loss: The cost function of logistic regression is derived by taking the log of the maximum likelihood function and negating it, producing a negative log-likelihood that can be minimized with gradient descent. This is why the cross-entropy loss function is also called the log loss function.
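The training loop described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the toy dataset, learning rate, and iteration count are all assumptions chosen so the example converges.

```python
import numpy as np

# Toy binary dataset: one feature, two well-separated clusters labeled 0/1 (assumed for illustration).
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 1, 50), rng.normal(2, 1, 50)])
y = np.concatenate([np.zeros(50), np.ones(50)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Gradient descent on the cross-entropy (log) loss.
w, b = 0.0, 0.0
lr = 0.1  # learning rate (assumed)
for _ in range(500):
    p = sigmoid(X * w + b)            # predicted probabilities
    grad_w = np.mean((p - y) * X)     # dL/dw for the cross-entropy loss
    grad_b = np.mean(p - y)           # dL/db
    w -= lr * grad_w
    b -= lr * grad_b

p = sigmoid(X * w + b)
loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))  # final cross-entropy loss
accuracy = np.mean((p > 0.5) == y)
```

Note that the gradient of the cross-entropy loss with respect to the weights reduces to the simple form `(p - y) * x`, which is one reason this loss pairs so naturally with the sigmoid output.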
- Types of logistic regression model
- Binomial Logistic Regression: Used when the dependent variable has two possible outcomes, like ‘success’ or ‘failure’.
- Multinomial Logistic Regression: Applies when the outcome variable has more than two unordered categories, like ‘type of fruit’.
- Ordinal Logistic Regression: Suitable for dependent variables with ordered categories, such as ‘satisfaction rating’.
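To connect the multinomial case back to the softmax classifier mentioned earlier, here is a minimal sketch of the softmax function turning raw class scores (logits) into probabilities. The three-class logits are made-up values for illustration.

```python
import numpy as np

def softmax(z):
    # Subtract the max logit for numerical stability; the output is unchanged.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

# Hypothetical logits for a 3-class problem (e.g. three types of fruit).
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
# probs sums to 1, and the largest logit receives the largest probability.

# The binary sigmoid is a special case: softmax over [z, 0] gives sigmoid(z) for class 1.
z = 1.5
sigmoid_z = 1.0 / (1.0 + np.exp(-z))
softmax_z = softmax(np.array([z, 0.0]))[0]
```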
- A logistic regression model is evaluated using some of the following:
- AIC (Akaike Information Criterion): AIC is a statistical measure used to compare different statistical models. It balances model complexity against goodness of fit. Lower AIC values indicate a better model, considering both the likelihood of the model and the number of parameters used. It helps in model selection.
- Deviance (Null and Residual): In statistical modeling, deviance measures the difference between a fitted model and a saturated model. Null deviance shows this difference for a model with only the intercept, reflecting the model’s fit if no predictors are used. Residual deviance shows the difference after fitting the model with predictors.
- ROC Curve (Receiver Operating Characteristic Curve): The ROC curve is a graphical representation used in binary classification to assess the performance of a model. It plots the true positive rate against the false positive rate at various threshold settings. The area under the curve (AUC) indicates the model’s accuracy.
- Hosmer-Lemeshow Test: This test is used to evaluate the goodness of fit for logistic regression models. It compares observed event rates with predicted probabilities in subgroups of the dataset, providing a measure of how well the model predicts outcomes. A high p-value suggests a good fit.
- Pseudo R-squared: Pseudo R-squared values are used in logistic regression as counterparts to R-squared in linear regression. They provide an indication of the model’s explanatory power. Different versions (McFadden, Cox and Snell, etc.) measure this in slightly different ways, none of which can be interpreted as the proportion of variance explained as in linear regression.
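The AIC and deviance calculations above can be worked through numerically. The sketch below assumes a small set of made-up observed labels and fitted probabilities, and a model with two parameters; for binary data the saturated model has a log-likelihood of zero, so the residual deviance is simply -2 times the model's log-likelihood.

```python
import numpy as np

# Hypothetical observed labels and fitted probabilities from a logistic regression.
y = np.array([1, 0, 1, 1, 0, 1, 0, 1])
p = np.array([0.9, 0.2, 0.8, 0.7, 0.3, 0.6, 0.4, 0.85])
k = 2  # number of fitted parameters: intercept + one coefficient (assumed)

# Log-likelihood of the fitted model.
log_lik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Residual deviance: -2 * logL (saturated model has logL = 0 for binary outcomes).
residual_deviance = -2 * log_lik

# Null deviance: same formula for the intercept-only model, where p = mean(y).
p_null = np.full_like(p, y.mean())
null_deviance = -2 * np.sum(y * np.log(p_null) + (1 - y) * np.log(1 - p_null))

# AIC = 2k - 2*logL; lower values indicate a better trade-off of fit vs. complexity.
aic = 2 * k - 2 * log_lik
```

A useful sanity check: the residual deviance should be smaller than the null deviance whenever the predictors add explanatory power, and the gap between the two is the basis for likelihood-ratio comparisons.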
In case you have not scored well enough, it may be a good idea to go through basic machine learning concepts related to logistic regression. Following is a list of some of my related blog pages: