In this post, you will learn about Logistic Regression terminology / glossary with quiz / practice questions. For machine learning engineers or data scientists wanting to test their understanding of logistic regression or preparing for interviews, these concepts and the related quiz questions and answers will come in handy. Here is a related post, 30 Logistic regression interview practice questions, that I posted earlier. Here are some of the questions and answers discussed in this post:
- What are different names / terms used in place of Logistic regression?
- Define Logistic regression in simple words?
- Define logistic regression in terms of logit?
- Define logistic function?
- What does training a logistic regression model mean?
- What are different types of logistic regression models?
- What are different implementations of Logistic regression in Python Sklearn?
- What is regularization in Logistic regression and what are its different types?
- When to use which types of regularization in Logistic regression?
1. What are different names / terms used in place of Logistic regression?
The following are some of the different names / terms used:
- Logit regression
- Maximum entropy classification
- Log-linear classifier
2. Define Logistic Regression in simple words?
Logistic regression is an algorithm in which the logarithm of the odds of an event occurring (Class = 1 in the case of binary classification) is modeled as a linear combination of one or more features and their coefficients. In other words, logistic regression determines the relationship between the log-odds of an event happening (dependent variable) and one or more independent variables. Mathematically speaking, this is how it looks:
[latex]
Log(\frac{P}{1-P}) = w_0*1 + w_1*x_1 + w_2*x_2 + w_3*x_3 + \ldots + w_n*x_n
[/latex]
In the above mathematical equation, P denotes the probability of the event happening. [latex]\frac{P}{1-P}[/latex] denotes the odds of the event occurring. This can be read as follows:
For every 1-unit increase in the value of [latex]x_n[/latex], the log-odds of the event happening increase by [latex]w_n[/latex] units; equivalently, the odds of the event happening get multiplied by [latex]e^{w_n}[/latex].
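As a quick numerical check of this interpretation (using a purely hypothetical coefficient value, not one from the post), the odds ratio for a 1-unit increase in a feature equals [latex]e^{w_n}[/latex]:

```python
import numpy as np

# Hypothetical fitted coefficient for feature x_1 (illustrative value only)
w1 = 0.7

# Log-odds at two feature values one unit apart, all else held fixed
log_odds_low = 1.2              # assumed baseline log-odds
log_odds_high = log_odds_low + w1

# Ratio of the odds equals e^(w1), independent of the baseline
odds_ratio = np.exp(log_odds_high) / np.exp(log_odds_low)
print(odds_ratio)               # same as np.exp(0.7)
```

The baseline log-odds cancel out in the ratio, which is why the multiplicative effect of a coefficient on the odds does not depend on the values of the other features.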
It is called logistic regression because the probability of the event occurring (labeled as 1) can be expressed as a logistic function, such as the following:
[latex]
P = \frac{1}{1 + e^{-Z}}
[/latex]
In the above equation, Z is a linear combination of the independent variables and their coefficients: [latex]Z = \beta_0 + \beta_1*x_1 + \ldots + \beta_n*x_n[/latex]
3. Define Logistic Regression in terms of Logit?
Logistic regression is also called logit regression because the dependent variable can be expressed as the logit of the probability of the event happening (Class = 1). The logit of a probability is nothing but the logarithm of the odds of the event happening.
[latex]\text{logit}(P(Y=1)) = \log\left(\frac{P(Y=1)}{1-P(Y=1)}\right)[/latex]
The logistic function is the inverse of the logit function, and is therefore also called the inverse-logit.
4. Define Logistic Function
The logistic function is a sigmoid function that takes any real value as input and outputs a value between 0 and 1. Here is the equation:
[latex]
\sigma(z) = \frac{1}{1 + exp(-z)}
[/latex]
In the above equation, exp represents the exponential function (base e). In the case of logistic regression, z represents the logit of the probability of the event happening, i.e., the log-odds, and [latex]\sigma(z)[/latex] represents the probability of the event happening.
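A minimal sketch of the sigmoid function defined above, showing how it maps real inputs into the (0, 1) range:

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# z = 0 gives probability 0.5; large |z| saturates toward 0 or 1
print(sigmoid(0.0))    # 0.5
print(sigmoid(4.0))    # ~0.982
print(sigmoid(-4.0))   # ~0.018
```

Note the symmetry: [latex]\sigma(-z) = 1 - \sigma(z)[/latex], which is why the two saturating outputs above sum to 1.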
5. What does training a logistic regression model mean?
Training a logistic regression model means modeling the dependent random variable Y as 1 or 0 (in the case of binary classification) given the independent variables; in other words, approximating a mathematical function that outputs the probability of an event happening as a function of the independent variables. The goal is to find the coefficients of the independent variables of the model.
The objective function used to estimate the coefficients is the likelihood function, which represents the likelihood of the observed data given the parameters. The optimization is performed using techniques such as gradient descent, by maximizing the log of the likelihood function or, equivalently, minimizing the negative log-likelihood.
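The training loop described above can be sketched as plain gradient descent on the negative log-likelihood, using a synthetic dataset with assumed "true" coefficients for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary data: y depends on a single feature (illustrative only)
X = np.column_stack([np.ones(200), rng.normal(size=200)])  # bias column + feature
true_w = np.array([-0.5, 2.0])                             # assumed true coefficients
y = (rng.random(200) < 1 / (1 + np.exp(-X @ true_w))).astype(float)

# Gradient descent on the mean negative log-likelihood
w = np.zeros(2)
lr = 0.1
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ w))      # predicted probabilities
    grad = X.T @ (p - y) / len(y)     # gradient of the mean NLL
    w -= lr * grad

print(w)  # estimates should land in the neighborhood of true_w
```

The gradient has the simple form X.T @ (p - y) because the derivative of the log-likelihood of the logistic model with respect to the coefficients reduces to the residual between predicted probabilities and labels.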
6. What are different types of logistic regression models?
The following are different types of logistic regression models:
- Binary classifier – Classifies the data as belonging to one of two classes
- Multinomial classifier – Classifies the data belonging to more than two classes, using either a joint multinomial (softmax) formulation or techniques such as one-vs-rest (OvR)
- Ordinal classifier – Classifies the data into classes that have a natural order.
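A short sketch of the binary vs. multinomial cases using Sklearn's LogisticRegression on the iris dataset (recent scikit-learn versions handle a multi-class target with a softmax formulation by default):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)   # 3 classes -> a multinomial problem

# Binary case: restrict to two of the three classes
mask = y < 2
binary = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
print(binary.predict_proba(X[:1]).shape)   # (1, 2): one probability per class

# Multinomial case: all three classes at once
multi = LogisticRegression(max_iter=1000).fit(X, y)
print(multi.predict_proba(X[:1]).shape)    # (1, 3)
```

Ordinal logistic regression is not provided by Sklearn's LogisticRegression; it requires a dedicated ordinal model (e.g., from the statsmodels or mord libraries).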
7. What are different implementations of Logistic regression in Python Sklearn?
The following are different implementations of Logistic regression in Scikit-learn (Sklearn) in Python:
- LogisticRegression (sklearn.linear_model)
- LogisticRegressionCV (sklearn.linear_model)
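The difference between the two implementations, in a minimal sketch: LogisticRegression uses a regularization strength C that you choose yourself, while LogisticRegressionCV searches over a grid of C values with cross-validation. Standardizing features here is just to help the solver converge.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)   # scaling helps the lbfgs solver converge

# Fixed regularization strength, chosen by hand
clf = LogisticRegression(C=1.0, max_iter=1000).fit(X, y)

# Cross-validated search over 10 C values
clf_cv = LogisticRegressionCV(Cs=10, cv=5, max_iter=1000).fit(X, y)

print(clf.score(X, y))   # training accuracy with the hand-picked C
print(clf_cv.C_)         # C selected by cross-validation
```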
8. What is regularization in Logistic regression and what are its different types?
Regularization in logistic regression means constraining the values of the coefficients of the independent variables to achieve objectives such as the following:
- Enhanced generalization performance: Reduces overfitting of the model, thereby increasing its generalization performance. This is typically achieved using L2 regularization.
- Feature selection: Selects the most important features by nullifying unimportant ones (driving their coefficients to exactly 0), which would otherwise add to the complexity of the model. This reduces overall model complexity and improves computational efficiency. This is achieved using L1 regularization.
The different types of regularization supported in logistic regression are as follows:
- L1-norm regularization: In L1 regularization, the absolute value of the magnitude of each coefficient is added as a penalty term to the loss function. The following is how the cost function / objective function looks for L1-norm regularization:
[latex]
\min_{w, c} \|w\|_1 + C \sum_{i=1}^n \log(\exp(- y_i (X_i^T w + c)) + 1).
[/latex]
- L2-norm regularization: In L2 regularization, the square of the magnitude of each coefficient is added as a penalty term to the loss function. The following represents the cost function for L2-norm regularization:
[latex]
\min_{w, c} \frac{1}{2}w^T w + C \sum_{i=1}^n \log(\exp(- y_i (X_i^T w + c)) + 1) .
[/latex]
- Elastic-net regularization: The cost function of elastic-net regularization looks like the following. In the equation below, ρ controls the mix of ℓ1 vs. ℓ2 regularization. Note that elastic-net regularization is equivalent to ℓ1-norm regularization when ρ=1 and equivalent to ℓ2-norm regularization when ρ=0.
[latex]
\min_{w, c} \frac{1 - \rho}{2}w^T w + \rho \|w\|_1 + C \sum_{i=1}^n \log(\exp(- y_i (X_i^T w + c)) + 1)
[/latex]
9. When to use which type of regularization in Logistic regression?
Here is the guideline on when to use which type of regularization:
- Use L1 regularization when the objective is feature selection.
- Use L2 regularization when the objective is to reduce model overfitting.
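This guideline can be seen directly in the coefficients: on a synthetic dataset where only a few features are informative, an L1-penalized model zeroes out many coefficients, while an L2-penalized model merely shrinks them:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data: only 3 of the 20 features are informative
X, y = make_classification(n_samples=500, n_features=20, n_informative=3,
                           n_redundant=0, random_state=42)

l1 = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
l2 = LogisticRegression(penalty="l2", solver="liblinear", C=0.1).fit(X, y)

print("L1 zero coefficients:", int(np.sum(l1.coef_ == 0)))  # many exact zeros
print("L2 zero coefficients:", int(np.sum(l2.coef_ == 0)))  # typically none
```

The solver is set to liblinear because the default lbfgs solver does not support the L1 penalty.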