When not to use F-Statistics for Multi-linear Regression

In this post, you will learn about the scenario in which you may NOT want to use the F-statistic to test whether there is a relationship between the response and the predictor variables in a multilinear regression model. Multilinear regression is a machine learning / statistical learning method used to predict a quantitative response variable and to understand (infer) the relationship between the response and multiple predictor variables. We will look into the following topics:

  • Background
  • When not to use F-Statistics for Multilinear Regression Model

Background

The F-statistic is used in hypothesis testing to determine whether there is a relationship between the response and the predictor variables in a multilinear regression model. Let’s consider the following multilinear regression model:

[latex]Y = \beta_0 + \beta_1X_1 + \beta_2X_2 + \beta_3X_3 + \ldots + \beta_pX_p + \epsilon[/latex]

In the above equation, Y is the response variable, [latex]X_1, \ldots, X_p[/latex] are the predictor variables, [latex]\beta_0, \ldots, \beta_p[/latex] are the coefficients and [latex]\epsilon[/latex] is the error term.

The null hypothesis can be stated as the following:

[latex]H_0: \beta_1 = \beta_2 = \ldots = \beta_p = 0[/latex]

The alternative hypothesis can be stated as the following:

[latex]H_a:[/latex] at least one of the coefficients [latex]\beta_j[/latex] is not equal to zero

To decide whether to reject or fail to reject the null hypothesis above, the F-statistic is used. The following is the formula for the F-statistic:

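F Value = [latex]\frac{(TSS - RSS)/p}{RSS/(N - p - 1)}[/latex]

In the above equation, TSS is the total sum of squares [latex]\sum_{i=1}^{N}(y_i - \bar{y})^2[/latex], RSS is the residual sum of squares [latex]\sum_{i=1}^{N}(y_i - \hat{y}_i)^2[/latex], N is the number of observations and p is the number of predictor variables (the coefficients [latex]\beta_1, \ldots, \beta_p[/latex], not counting the intercept [latex]\beta_0[/latex]).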
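As a rough illustration of the formula, here is a minimal sketch in Python using NumPy. The function name `f_statistic` and the simulated data are assumptions made for this example only; they are not part of any library.

```python
import numpy as np

def f_statistic(X, y):
    """F-statistic for H0: beta_1 = ... = beta_p = 0 (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    if n - p - 1 <= 0:
        raise ValueError("Need N > p + 1 observations to compute the F-statistic")

    # Add a column of ones so that beta_0 (the intercept) is estimated too.
    X1 = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    y_hat = X1 @ beta

    tss = np.sum((y - y.mean()) ** 2)   # total sum of squares
    rss = np.sum((y - y_hat) ** 2)      # residual sum of squares
    return ((tss - rss) / p) / (rss / (n - p - 1))

# Simulated data (an assumption): two of the three predictors affect y.
rng = np.random.default_rng(0)
N, p = 100, 3
X = rng.normal(size=(N, p))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=N)

F = f_statistic(X, y)
print(F)
```

Since the simulated response really does depend on two predictors, the printed value should be far above 1, which is what one would expect when the null hypothesis is false.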

Based on the above, the value of the F-statistic can be calculated, and the related p-value can then be derived from it. If the p-value is less than the chosen significance level (commonly 0.05), one can reject the null hypothesis. This essentially means there is evidence that at least one of the predictor variables is related to the response.
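The step from F value to p-value can be sketched as follows, assuming SciPy is available. Under the null hypothesis (and the usual linear model assumptions), the F-statistic follows an F distribution with p and N - p - 1 degrees of freedom, so the p-value is the upper tail probability. The helper name `f_test_p_value` and the input numbers are made up for illustration.

```python
from scipy import stats

def f_test_p_value(f_value, n_obs, n_predictors):
    """Upper-tail p-value of the overall F-test (hypothetical helper)."""
    dfn = n_predictors               # numerator degrees of freedom: p
    dfd = n_obs - n_predictors - 1   # denominator degrees of freedom: N - p - 1
    if dfd <= 0:
        raise ValueError("Need N > p + 1 observations for the F-test")
    # sf is the survival function, i.e. P(F > f_value) under H0.
    return float(stats.f.sf(f_value, dfn, dfd))

# Illustrative numbers only: F = 4.0 from a model with N = 13 and p = 2.
p_value = f_test_p_value(f_value=4.0, n_obs=13, n_predictors=2)
reject_h0 = p_value < 0.05
print(p_value, reject_h0)
```

The 0.05 threshold here mirrors the one used in the text; a different significance level could be chosen depending on the problem.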

However, the question arises: can the F-statistic always be used?

When not to use F-Statistics for Multilinear Regression Model

The F-statistic can be used to test for a relationship between the response and the predictor variables in a multilinear regression model when p (the number of predictors) is relatively small compared to N (the number of observations).

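However, when the number of predictors (features) is larger than N (the number of observations), the regression model cannot be fit using least squares: there is no unique set of coefficient estimates, and the residual degrees of freedom, N - p - 1, become negative. Thus, the F-statistic cannot be used.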
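To see what goes wrong, here is a small sketch with NumPy on simulated data (the sizes N = 5 and p = 10 are arbitrary assumptions). The response is pure noise, yet least squares fits it perfectly because there are more coefficients than observations.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 5, 10                      # more predictors than observations
X = rng.normal(size=(N, p))
y = rng.normal(size=N)            # pure noise: no real relationship with X

# Design matrix with an intercept column: N rows, p + 1 columns.
X1 = np.column_stack([np.ones(N), X])

# lstsq still returns a solution, but it is only one of infinitely many.
beta, _, rank, _ = np.linalg.lstsq(X1, y, rcond=None)
rss = float(np.sum((y - X1 @ beta) ** 2))
residual_df = N - p - 1

print(rank)         # rank of the design matrix, at most N
print(rss)          # essentially zero: the noise is fit exactly
print(residual_df)  # negative, so RSS / (N - p - 1) is meaningless
```

With RSS at zero and negative residual degrees of freedom, the denominator of the F-statistic has no sensible value, so the test cannot be carried out.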

Summary

The F-statistic can be used to test whether there is a relationship between the response and the predictor variables in a multilinear regression model. If the number of predictors (features) is small in comparison to the number of observations, one can use the F-statistic to perform this hypothesis test. However, if the number of predictors is larger than the number of observations, the F-statistic cannot be used, as one won’t be able to fit a multilinear regression model using least squares in the first place.

Ajitesh Kumar
