Ordinary Least Squares Method: Concepts & Examples

ordinary least squares method

Ordinary least squares (OLS) is a linear regression technique used to find the best-fitting line for a set of data points. It is a popular method because it is easy to use and produces decent results. In this blog post, we will discuss the basics of OLS and provide some examples to help you understand how it works. As data scientists, it is very important to learn the concepts of OLS before using it in the regression model.

What is the ordinary least squares (OLS) method?

The ordinary least squares (OLS) method can be defined as a linear regression technique that is used to estimate the unknown parameters in a model. The method relies on minimizing the sum of squared residuals between the actual and predicted values. The residual can be defined as the difference between the actual value and the predicted value. Another word for residual can be error. In mathematical terms, this can be written as:

Minimize ∑(yiŷi)^2

where yi is the actual value, ŷi is the predicted value. A simple linear regression model used for determining the value of the response variable, ŷ, can be represented as the following equation. Note the method discussed in this blog can as well be applied to multivariate linear regression model.

ŷ = mX + b

where ŷ is the predicted value, b is the intercept, and m is the slope of the line. The coefficients m and b can also be called the coefficients of determination. The OLS method can be used to estimate the unknown parameters (m and b) by minimizing the sum of squared residuals. In other words, the OLS method finds the best-fit line for the data by minimizing the sum of squared errors or residuals between the actual and predicted values. The sum of squared residuals is also termed the sum of squared error (SSE).

This method is also known as the least-squares method for regression or linear regression.

Minimizing the sum of squares residuals using the calculus method

The following represents the calculus method for minimizing the sum of squares residuals to find the unknown parameters for the model y = mx + b

Take the partial derivative of the cost function, sum of squared residuals, ∑(yi – ŷi)^2 with respect to m:

∂/∂m (SSE) = ∑-2Xi(yi – ŷi)

Take the partial derivative of the cost function, ∑ (yi – ŷi)^2 with respect to b:

∂/∂b (SSE) = ∑-2(yi – ŷi)

Set the partial derivatives equal to zero and solve for m and b:

∑-2Xi(yi – ŷi) = 0

∑-(yi – ŷi) = 0

This results in the following two equations:

∑yi*xi = m∑xi*xi + b*∑xi

∑yi = m∑xi + b*n

where n is the number of data points. These two equations can be solved simultaneously to find the values for m and b. Let’s say that the following three points are available such as (3, 7), (4, 9), (5, 12). And, the ask is to find the best fit line.

We will apply the calculus technique and use the above formulas. We will use the following formula:

∑-2Xi(yi – ŷi) = 0

The following calculation will happen:

-2[3(7 – (3m + b)) + 4(9 – (4m + b)) + 5(12 – (5m + b))] = 0

=> 3*7 + 4*9 + 5*12 – (9m + 3b + 16m + 4b + 25m + 5b) = 0

=> 21 + 36 + 60 – (50m + 12b) = 0

=> 116 = 50m + 12b …. eq (1)

Let’s use another formula to find another equation:

∑-(yi – ŷi) = 0

The following calculation will happen:

7 – (3m + b) + 9 – (4m + b) + 12 – (5m + b) = 0

=> 28 = 12m + 3b … eq(2)

The above two equations can be solved and the values of m and b can be found.

Summary

The ordinary least squares (OLS) method is a linear regression technique that is used to estimate the unknown parameters in a model. The method relies on minimizing the sum of squared residuals between the actual and predicted values. The OLS method can be used to find the best-fit line for data by minimizing the sum of squared errors or residuals between the actual and predicted values. And, the calculus method for minimizing the sum of squares residuals is take the partial derivative of the cost function with respect to the coefficients of determination, set the partial derivatives equal to zero and solve for each of the coefficients. The OLS method is also known as least squares method for regression or linear regression.

Ajitesh Kumar
Follow me

Ajitesh Kumar

I have been recently working in the area of Data analytics including Data Science and Machine Learning / Deep Learning. I am also passionate about different technologies including programming languages such as Java/JEE, Javascript, Python, R, Julia, etc, and technologies such as Blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, big data, etc. For latest updates and blogs, follow us on Twitter. I would love to connect with you on Linkedin. Check out my latest book titled as First Principles Thinking: Building winning products using first principles thinking
Posted in Data Science, Machine Learning. Tagged with , .

One Response

Leave a Reply

Your email address will not be published.

Time limit is exhausted. Please reload the CAPTCHA.