30+ Logistic Regression Interview Questions & Answers

Last updated: 29th Nov, 2023

This page lists practice tests / interview questions and answers on logistic regression in machine learning. If you want to test your knowledge of logistic regression, these practice tests should be useful, both for checking your understanding from time to time and for interview preparation. Those preparing for fresher / intern interviews in machine learning should also find these questions helpful.

These tests primarily focus on the following concepts related to logistic regression:

  • Types of logistic regression (Binomial, Multinomial, Ordinal)
  • Logistic function, logit transformation
  • Evaluation of logistic regression (AIC, Deviance calculations)
  • Classification problems examples where logistic regression can be applied

Logistic Regression Concepts 

  • Logistic Regression for Discrete Outcomes: Logistic regression is used to predict a discrete-valued output, such as success or failure, or 0 or 1. It does so by estimating the probability of each outcome.
  • Binary and Multinomial Classification: Logistic regression can be used for binary classification as well as multinomial classification, that is, classifying data into more than two classes.
  • Softmax Classifier: In logistic regression, the term “softmax classifier” refers specifically to multinomial logistic regression, where the softmax function is used to handle multiple classes. You may want to check out my post on What’s Softmax function and why do we need it?
  • Gradient Descent and Cross-Entropy Loss: A logistic regression classifier is commonly trained by applying gradient descent to the cross-entropy loss function. In other words, the weights of the classifier are learned by using the gradient descent algorithm to minimize the cross-entropy loss. You may want to check my post on Cross-entropy loss explained with Python examples.
  • Cost Function and Log Loss: The cost function of logistic regression is derived by taking the log of the likelihood function and negating it. Maximizing the likelihood then becomes minimizing a loss, which is the form gradient descent needs. This is why the cross-entropy loss function is also called the log loss function.
  • Types of logistic regression models:
    • Binomial Logistic Regression: Used when the dependent variable has two possible outcomes, like ‘success’ or ‘failure’.
    • Multinomial Logistic Regression: Applies when the outcome variable has more than two unordered categories, like ‘type of fruit’.
    • Ordinal Logistic Regression: Suitable for dependent variables with ordered categories, such as ‘satisfaction rating’.
  • A logistic regression model can be evaluated using some of the following:
    • AIC (Akaike Information Criterion): AIC is a statistical measure used to compare different statistical models. It balances model complexity against goodness of fit. Lower AIC values indicate a better model, considering both the likelihood of the model and the number of parameters used. It helps in model selection.
    • Deviance (Null and Residual): In statistical modeling, deviance measures the difference between a fitted model and a saturated model. Null deviance shows this difference for a model with only the intercept, reflecting the model’s fit if no predictors are used. Residual deviance shows the difference after fitting the model with predictors.
    • ROC Curve (Receiver Operating Characteristic Curve): The ROC curve is a graphical representation used in binary classification to assess the performance of a model. It plots the true positive rate against the false positive rate at various threshold settings. The area under the curve (AUC) summarizes how well the model separates the positive and negative classes across all thresholds. It is a measure of discrimination, not of accuracy at any single threshold.
    • Hosmer-Lemeshow Test: This test is used to evaluate the goodness of fit for logistic regression models. It compares observed event rates with predicted probabilities in subgroups of the dataset, providing a measure of how well the model is calibrated. A high p-value means the test found no evidence of poor fit.
    • Pseudo R-squared: Pseudo R-squared values are used in logistic regression as counterparts to R-squared in linear regression. They provide an indication of the model’s explanatory power. Different versions (McFadden, Cox and Snell, etc.) measure this in slightly different ways, none of which can be interpreted as the proportion of variance explained as in linear regression.
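
To make the training and evaluation ideas above concrete, here is a minimal sketch in Python using only NumPy. It fits a binomial logistic regression by batch gradient descent on the cross-entropy loss, then computes the null deviance, residual deviance, AIC, and McFadden's pseudo R-squared. The synthetic "hours of study" data, the helper names (`sigmoid`, `log_loss`, `fit_logistic`), the learning rate, and the iteration count are all assumptions made for illustration. For ungrouped binary data the saturated model has a log-likelihood of 0, so deviance reduces to -2 times the log-likelihood.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, p, eps=1e-12):
    """Mean cross-entropy (log loss) for binary labels y and probabilities p."""
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit_logistic(X, y, lr=0.1, n_iter=20000):
    """Batch gradient descent on the mean cross-entropy loss."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        grad = Xb.T @ (p - y) / len(y)          # gradient of the mean log loss
        w -= lr * grad
    return w

# Synthetic data (an assumption): pass/fail outcome versus hours of study
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=200)
passed = (rng.uniform(size=200) < sigmoid(-4.0 + 0.8 * hours)).astype(float)

w = fit_logistic(hours.reshape(-1, 1), passed)
p_hat = sigmoid(w[0] + w[1] * hours)

n = len(passed)
# Deviance = -2 * log-likelihood for ungrouped binary data
residual_deviance = 2 * n * log_loss(passed, p_hat)
# Null model: intercept only, so every prediction is the observed pass rate
null_deviance = 2 * n * log_loss(passed, np.full_like(passed, passed.mean()))

k = len(w)                                      # number of fitted parameters
aic = residual_deviance + 2 * k                 # AIC = -2 logL + 2k
mcfadden_r2 = 1 - residual_deviance / null_deviance

print(f"weights={w}, null={null_deviance:.1f}, residual={residual_deviance:.1f}")
print(f"AIC={aic:.1f}, McFadden pseudo R-squared={mcfadden_r2:.3f}")
```

In practice a statistics library would usually fit the model with a second-order method rather than plain gradient descent. The sketch uses gradient descent only because that is the training procedure described in the list above.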

If your score is lower than you hoped, it may be a good idea to revisit the basic machine learning concepts related to logistic regression.

Practice Test / Interview Questions & Answers

#1. Which of the following is analogous to R-Squared for logistic regression?

#2. ______ the value of AUC, the better the predictive power of the model.

#3. Which of the following is used to identify the best threshold for separating positive and negative classes?

#4. Given two models with different AIC values, which one would be the preferred model?

#5. In a logistic regression model, if the odds ratio for a predictor variable is less than 1, this suggests that:

#6. A ______ value of deviance represents a better fit of the model

#7. When using logistic regression for a binary classification problem, if the output of the model is 0.75 for a given observation, this means:

#8. ROC, as in ROC curve, stands for _______

#9. Regression coefficients in logistic regression are estimated using ________

#10. Which of the following metrics is equal to True Positive / (True Positive + False Negative)?

#11. Which of the following can be used to evaluate the performance of a logistic regression model?

#12. In the context of logistic regression, what does the term “odds ratio” refer to?

#13. Whether a student will pass or fail a competitive exam, based on hours of study, can be predicted using a _________ regression model

#14. Logistic regression is _________ when the observed outcome of the dependent variable can have multiple possible types

#15. Which of the following can be used to evaluate the performance of a logistic regression model?

#16. Deviance is a function of _________

#17. Which of the following metrics is equal to True Positive / (True Positive + False Positive)?

#18. If the ROC curve of a model is closer to the top-left corner of the plot, it indicates:

#19. Logistic regression is _________ when the observed outcome of the dependent variable can have only two values, such as 0 and 1 or success and failure

#20. Estimation in logistic regression chooses the parameters that ___________ the likelihood of observing the sample values.

#21. Which of the following tests can be used to assess whether the logistic regression model is well calibrated?

#22. ________ regression can be termed as a special case of _________ regression when the outcome variable is categorical

#23. Which of the following is an application of logistic regression?

#24. Logistic regression uses the softmax function for which of the following?

#25. Which of the following is the link function in logistic regression?

#26. In the context of logistic regression, if a model has a deviance much larger than its degrees of freedom, this typically indicates:

#27. The odds of the dependent variable equaling a case (given some linear combination x of the predictors) is equivalent to _______

#28. Logistic regression is _________ when the observed outcomes of the dependent variable are ordered

#29. An AUC-ROC value of 0.5 indicates that the model's ability to discriminate between positive and negative classes is:

#30. How many marks a student can get in a competitive exam, based on hours of study, can be predicted using a _________ regression model

#31. The ROC curve is a plot of __________ vs ___________

#32. If the model deviance is significantly ________ than the null deviance, then one can conclude that the predictor or set of predictors significantly improved model fit.

#33. Deviance can be shown to follow __________

#34. In the context of model comparison, if the difference in deviance between a more complex model and a simpler model is statistically significant, this suggests:
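
Several of the questions above refer to confusion-matrix ratios, odds, and the ROC curve. As a rough sketch, these quantities can be computed by hand on a small example. The labels and predicted probabilities below are made up for illustration, and the helper names (`confusion_counts`, `auc_by_pairs`) are hypothetical, not taken from any library.

```python
import numpy as np

# Hypothetical labels and predicted probabilities, for illustration only
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 0])
p_hat = np.array([0.9, 0.8, 0.75, 0.4, 0.35, 0.2, 0.6, 0.1, 0.55, 0.3])

def confusion_counts(y, p, threshold=0.5):
    """Return (TP, FP, FN, TN) after thresholding the probabilities."""
    pred = (p >= threshold).astype(int)
    tp = int(np.sum((pred == 1) & (y == 1)))
    fp = int(np.sum((pred == 1) & (y == 0)))
    fn = int(np.sum((pred == 0) & (y == 1)))
    tn = int(np.sum((pred == 0) & (y == 0)))
    return tp, fp, fn, tn

def auc_by_pairs(y, p):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    diff = p[y == 1][:, None] - p[y == 0][None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

tp, fp, fn, tn = confusion_counts(y_true, p_hat)
recall = tp / (tp + fn)        # true positive rate (sensitivity)
precision = tp / (tp + fp)     # share of predicted positives that are correct
fpr = fp / (fp + tn)           # false positive rate; (fpr, recall) is one ROC point

# Odds and the logit for a predicted probability of 0.75
p = 0.75
odds = p / (1 - p)             # probability of the event over its complement
logit = np.log(odds)           # the linear predictor on the log-odds scale
odds_from_logit = np.exp(logit)

auc = auc_by_pairs(y_true, p_hat)
print(tp, fp, fn, tn, recall, precision, fpr, odds, auc)
```

At a threshold of 0.5 this example gives TP = 4, FP = 1, FN = 1, TN = 4, so recall and precision are both 0.8, and 23 of the 25 positive-negative pairs are ranked correctly, for an AUC of 0.92. Exponentiating the logit recovers the odds of 3, which is the relationship between the linear predictor and the odds in logistic regression.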

Ajitesh Kumar

I have recently been working in the area of Data analytics, including Data Science and Machine Learning / Deep Learning. I am also passionate about different technologies, including programming languages such as Java/JEE, Javascript, Python, R, Julia, etc., and technologies such as Blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, big data, etc. For latest updates and blogs, follow me on Twitter. I would love to connect with you on Linkedin. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking. Check out my other blog, Revive-n-Thrive.com
Posted in Data Science, Interview questions, Machine Learning.