Last updated: 18th August, 2024
As data scientists, we navigate a sea of metrics to evaluate the performance of our regression models. Understanding these metrics, namely Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and R-Squared, is crucial for robust model evaluation and selection. In this blog, we look at each of these metrics through clear definitions, formulas, and guidance on when to use which one.
The regression model evaluation metrics covered here are MSE, RMSE, MAE, MAPE, R-squared, and Adjusted R-squared. They are used in different scenarios when training regression models to solve the problem at hand. Each metric provides a different lens through which to evaluate the performance of a regression model. Choosing the right metric depends on the specific context and objectives of our analysis. Understanding these metrics intuitively helps in selecting the most appropriate model and communicating its performance effectively.
MSE is a cost function that calculates the average of the squares of the errors, i.e., the average squared difference between the estimated values and the actual values. Suppose we have a regression model that predicts house prices. MSE would measure the average squared difference between the actual prices and the model’s predicted prices. For example, if the model predicts a house to be $300,000 and it is actually $320,000, the error is $20,000 and the squared error is (300,000 − 320,000)² = 400,000,000 (in squared dollars). MSE does this for all predictions and averages them. It emphasizes larger errors, which could be crucial in scenarios like financial forecasting where large errors are more detrimental.
Formula: $MSE = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$
If the data contains outliers, consider an evaluation metric such as MAE instead. Squaring magnifies the large errors that outliers produce, so a few extreme points can dominate MSE.
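As a minimal sketch of the calculation described above, the following NumPy snippet computes MSE by hand. The house prices are made-up illustrative values, and the helper name `mse` is an assumption for this example, not part of any library.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical actual and predicted house prices, in dollars.
actual = [320_000, 250_000, 410_000, 180_000]
predicted = [300_000, 255_000, 400_000, 185_000]

# Per-house squared errors: the 20,000 miss contributes 400,000,000,
# far more than the 5,000 misses (25,000,000 each).
squared_errors = (np.array(actual) - np.array(predicted)) ** 2
print(squared_errors)

# The average of those squared errors is the MSE (in squared dollars).
print(mse(actual, predicted))
```

Because each error is squared before averaging, the single largest miss accounts for most of the total here, which is the "emphasizes larger errors" behavior in action.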
RMSE is a cost function that can be represented as the square root of the mean square error, bringing the scale of the errors to be the same as the scale of targets. In the context of the house pricing example, RMSE brings the error metric back to the price scale. This makes it easier to understand the average error in terms of the actual values. If RMSE is $20,000, it means the typical prediction error is about $20,000.
Choosing Root Mean Squared Error (RMSE) over Mean Squared Error (MSE) can be advantageous for several reasons, particularly in the context of practical application and interpretability.
Formula: $RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}$
Here are some of the Kaggle competitions which used RMSE as the evaluation metrics:
As with MSE, RMSE is best avoided when the data has outliers, because the larger errors that outliers cause are penalized heavily. In that case, you might use MAE as the model performance metric instead.
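The following sketch illustrates both points about RMSE: it is on the same scale as the target, and it reacts strongly to a single outlier. The prices and the helper name `rmse` are illustrative assumptions.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Square root of the mean squared error, in the units of the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical actual and predicted house prices, in dollars.
actual = [320_000, 250_000, 410_000, 180_000]
predicted = [300_000, 255_000, 400_000, 185_000]

# RMSE is expressed in dollars, so it can be compared directly with prices.
print(rmse(actual, predicted))

# Same predictions, except the last house is now missed by 200,000.
predicted_with_outlier = [300_000, 255_000, 400_000, 380_000]

# A single large miss pulls RMSE up sharply.
print(rmse(actual, predicted_with_outlier))
```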
MAE measures the average magnitude of the errors in a set of predictions, without considering their direction. It is the average absolute difference between the predicted and actual values. Unlike MSE, it doesn’t square the errors, which means it doesn’t punish larger errors as harshly. In our house pricing example, if you’re off by $20,000 or $40,000, MAE treats these errors linearly. This metric is particularly useful when you want to avoid giving extra penalty to large errors.
The Mean Absolute Error (MAE) offers distinct advantages over Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) in certain situations.
Formula: $MAE = \frac{1}{n} \sum_{i=1}^{n} |Y_i - \hat{Y}_i|$
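To make the "linear" treatment of errors concrete, here is a small sketch that doubles one error and compares how MAE and MSE respond. The values and helper names are assumptions chosen for illustration.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean of the absolute differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true, y_pred):
    """Mean of the squared differences, shown here only for comparison."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Two hypothetical houses, both actually worth 300,000.
actual = [300_000, 300_000]
off_by_20k = [280_000, 300_000]  # one prediction misses by 20,000
off_by_40k = [260_000, 300_000]  # the same prediction misses by 40,000

# Doubling the error doubles MAE ...
mae_ratio = mae(actual, off_by_40k) / mae(actual, off_by_20k)
# ... but quadruples MSE, because the error is squared.
mse_ratio = mse(actual, off_by_40k) / mse(actual, off_by_20k)

print(mae_ratio, mse_ratio)
```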
Here are a couple of examples of Kaggle competitions which used MAE as the evaluation metrics:
MAPE expresses the error as a percentage of the actual values, providing an easy-to-understand metric. For instance, if a house is worth $200,000 and you predict $180,000, the error is 10%. This percentage-based approach makes MAPE very interpretable, especially when explaining model performance to stakeholders who might not be technical.
Mean Absolute Percentage Error (MAPE) offers unique advantages over Mean Absolute Error (MAE) and Mean Squared Error (MSE) / Root Mean Squared Error (RMSE) in certain scenarios. Its distinctive features make it a preferred choice in specific contexts:
Formula: $MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left|\frac{Y_i - \hat{Y}_i}{Y_i}\right|$
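The formula can be sketched in a few lines of NumPy. The helper name `mape` and the prices are illustrative assumptions. Because the formula divides by the actual value, it is undefined when an actual value is zero; raising an error in that case is one possible design choice for this sketch, and skipping such rows is another.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, returned in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if np.any(y_true == 0):
        # Design choice for this sketch: refuse zeros rather than skip them.
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

# A 20,000 error on a 200,000 house is about 10 percent ...
print(mape([200_000], [180_000]))

# ... while the same 20,000 error on a 100,000 house is about 20 percent.
print(mape([100_000], [80_000]))
```

The second call shows why MAPE suits targets that span very different magnitudes: the same absolute error weighs more on a smaller actual value.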
R-Squared indicates the proportion of the variance in the dependent variable that is predictable from the independent variables. R-Squared shows how well your predictions approximate the real data points. It’s like grading a test out of 100%. A high R-Squared (close to 1) means your model can very closely predict the actual values. For instance, in predicting house prices, a high R-Squared would indicate that your model captures most of the variability in house prices.
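One common way to compute R-Squared is $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$, where $SS_{res}$ is the sum of squared prediction errors and $SS_{tot}$ is the sum of squared deviations of the actual values from their mean. The sketch below follows that definition; the helper name `r_squared` and the prices (in thousands of dollars) are illustrative assumptions.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R-Squared computed as 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return float(1.0 - ss_res / ss_tot)

# Hypothetical house prices, in thousands of dollars.
actual = [200, 250, 300, 350, 400]

good_model = [210, 240, 305, 345, 395]   # tracks the actual prices closely
mean_model = [300, 300, 300, 300, 300]   # always predicts the average price
bad_model = [400, 350, 300, 250, 200]    # gets the trend backwards

print(r_squared(actual, good_model))  # close to 1
print(r_squared(actual, mean_model))  # exactly 0: no better than the mean
print(r_squared(actual, bad_model))   # negative: worse than the mean
```

Under this definition, a model that does worse than always predicting the mean produces a negative value, so R-Squared is not strictly confined to the 0 to 1 range.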
Using R-Squared over other metrics like MAE, MSE, RMSE, or MAPE has distinct advantages in specific contexts:
When using R-Squared, we also come across a related metric called Adjusted R-Squared. It is an essential statistical measure, especially in the context of multiple regression models. While R-Squared indicates the proportion of variance in the dependent variable that can be explained by the independent variables, it has a significant limitation: it tends to increase as more predictors are added to the model, regardless of whether those predictors actually improve the model. This is where Adjusted R-Squared becomes invaluable. It modifies the R-Squared formula to account for the number of predictors in the model. Unlike R-Squared, Adjusted R-Squared increases only if a new predictor improves the model more than would be expected by chance, and it can decrease if the predictor doesn’t improve the model sufficiently. This makes Adjusted R-Squared a more reliable metric, particularly when comparing models with different numbers of predictors. It penalizes the model for adding predictors that do not contribute to its predictive power, thus providing a more accurate reflection of the model’s ability to explain the variance in the dependent variable.
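The standard adjustment is $\bar{R}^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1}$, where $n$ is the number of observations and $p$ is the number of predictors. The sketch below applies it to two hypothetical models; the function name and the R-Squared values are made up for illustration, not results from a real dataset.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R-Squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

# Hypothetical comparison on 50 observations:
# model A uses 3 predictors and reaches R-Squared = 0.80,
# model B adds 7 more predictors and reaches R-Squared = 0.81.
adj_a = adjusted_r_squared(0.80, n_samples=50, n_predictors=3)
adj_b = adjusted_r_squared(0.81, n_samples=50, n_predictors=10)

# R-Squared went up slightly, but Adjusted R-Squared went down,
# signalling that the extra predictors did not earn their keep.
print(adj_a, adj_b)
```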
Based on the discussion in the previous section, the following is a list of key differences between these evaluation metrics:
Metrics | What? | Why? | When to Use? |
---|---|---|---|
MSE | Measures average squared difference between estimated and actual values. | Emphasizes larger errors. | When large errors are more critical. |
RMSE | Square root of MSE, in same units as response variable. | Easier interpretation of errors. | When error scale should match target scale. |
MAE | Average absolute difference between estimated and actual values. | Less sensitive to outliers. | With many outliers or non-normal residuals. |
MAPE | Percentage error between estimated and actual values. | Easy interpretation as a percentage. | For forecasting and percentage-based error analysis. |
R-Squared | Proportion of variance explained by the model. | Indicates model’s explanatory power. | To evaluate linear regression models’ fit. |
Adjusted R-Squared | Modifies the R-Squared value to account for the number of predictors. | Unlike R-Squared, it penalizes the model for including irrelevant predictors. | In multiple regression scenarios with several independent variables. |
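
To see why the choice of metric matters, consider this small sketch in which two hypothetical models are ranked differently by MAE and RMSE. All numbers are made up for illustration (prices in thousands of dollars), and the helper names are assumptions.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2)))

# Hypothetical house prices, in thousands of dollars.
actual = [200, 250, 300, 350, 400]

# Model A is always off by 15; model B is exact except for one miss of 50.
model_a = [215, 235, 315, 335, 415]
model_b = [200, 250, 300, 350, 350]

scores = {
    "A": {"MAE": mae(actual, model_a), "RMSE": rmse(actual, model_a)},
    "B": {"MAE": mae(actual, model_b), "RMSE": rmse(actual, model_b)},
}

# MAE favors model B (lower average miss), while RMSE favors model A
# (no single large miss), so the "better" model depends on the metric.
for name, metrics in scores.items():
    print(name, metrics)
```

If one large miss is much more costly than several small ones, the RMSE ranking is the more relevant one; if every dollar of error costs the same, the MAE ranking is.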
The following Python and R code calculates MSE, RMSE, MAE, MAPE, and R-Squared for evaluating regression models.
The following is the Python code example:

    import numpy as np
    from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

    # Assuming y_true and y_pred are the true and predicted values

    # MSE
    mse = mean_squared_error(y_true, y_pred)

    # RMSE: take the square root of MSE. The squared=False argument of
    # mean_squared_error is deprecated in recent scikit-learn releases.
    rmse = np.sqrt(mse)

    # MAE
    mae = mean_absolute_error(y_true, y_pred)

    # R-Squared
    r_squared = r2_score(y_true, y_pred)

    # Custom method for calculating MAPE (rows where y_true is 0 are skipped)
    def calculate_mape(y_true, y_pred):
        y_true, y_pred = np.array(y_true), np.array(y_pred)
        non_zero_mask = y_true != 0
        return np.mean(np.abs((y_true[non_zero_mask] - y_pred[non_zero_mask]) / y_true[non_zero_mask])) * 100

    # Example usage
    # y_true = [actual values]
    # y_pred = [predicted values]
    # mape = calculate_mape(y_true, y_pred)
The following is the R code example:

    # Assuming y_true and y_pred are the true and predicted values

    # MSE
    mse <- mean((y_pred - y_true)^2)

    # RMSE
    rmse <- sqrt(mse)

    # MAE
    mae <- mean(abs(y_pred - y_true))

    # R-Squared, computed as 1 - SS_res / SS_tot so that it matches r2_score
    # in the Python example. Fitting lm(y_true ~ y_pred) would instead report
    # the squared correlation between the two vectors.
    r_squared <- 1 - sum((y_true - y_pred)^2) / sum((y_true - mean(y_true))^2)

    # Custom method for calculating MAPE (rows where y_true is 0 are skipped)
    calculate_mape <- function(y_true, y_pred) {
      non_zero_indices <- which(y_true != 0)
      if (length(non_zero_indices) > 0) {
        mean(abs((y_true[non_zero_indices] - y_pred[non_zero_indices]) / y_true[non_zero_indices])) * 100
      } else {
        NA
      }
    }

    # Example usage
    # y_true <- c(actual values)
    # y_pred <- c(predicted values)
    # mape <- calculate_mape(y_true, y_pred)
By understanding these metrics, you as a data scientist can choose the most appropriate one for your specific context. Remember, no single metric is the “best” in all situations; the right choice depends on your objectives and the nature of your data. This insight into model evaluation will support your data science journey and lead to more accurate and reliable predictive regression models.