**Ordinary least squares (OLS)** is a linear regression technique used to find the best-fitting line for a set of data points. It is a popular method because it is easy to use and produces decent results. In this blog post, we will discuss the basics of OLS and provide some examples to help you understand how it works. As data scientists, it is very important to learn the concepts of OLS before using it in the regression model.

Table of Contents

## What is the ordinary least squares (OLS) method?

The ordinary least squares (OLS) method can be defined as a linear regression technique that is used to estimate the unknown parameters in a model. The method relies on minimizing the sum of squared residuals between the actual and predicted values. The residual can be defined as the difference between the actual value and the predicted value. Another word for residual can be error. In mathematical terms, this can be written as:

**Minimize **∑(**yi** – **ŷ**i)**^2**

where **y**i is the actual value, **ŷ**i is the predicted value. A simple linear regression model used for determining the value of the response variable, **ŷ,** can be represented as the following equation. Note the method discussed in this blog can as well be applied to multivariate linear regression model.

**ŷ** = **mX** + **b**

where **ŷ** is the predicted value, **b** is the intercept, and **m** is the slope of the line. The coefficients **m** and **b** can also be called the **coefficients of determination**. The OLS method can be used to estimate the unknown parameters (**m** and **b**) by minimizing the sum of squared residuals. In other words, the OLS method finds the best-fit line for the data by minimizing the sum of squared errors or residuals between the actual and predicted values. The sum of squared residuals is also termed the sum of squared error (SSE).

This method is also known as the **least-squares method** for regression or linear regression.

## Minimizing the sum of squares residuals using the calculus method

The following represents the calculus method for minimizing the sum of squares residuals to find the unknown parameters for the model y = mx + b

Take the partial derivative of the cost function, sum of squared residuals, **∑(yi – ŷi)^2 **with respect to m:

**∂/∂m (SSE) = ∑-2Xi(yi – ŷi)**

Take the partial derivative of the cost function,** ∑ (yi – ŷi)^2** with respect to b:

**∂/∂b (SSE) = ∑-2(yi – ŷi)**

Set the partial derivatives equal to zero and solve for m and b:

**∑-2Xi(yi – ŷi) = 0**

**∑-(yi – ŷi) = 0**

This results in the following two equations:

**∑yi*xi = m∑xi*xi + b*∑xi**

**∑yi = m∑xi + b*n**

where n is the number of data points. These two equations can be solved simultaneously to find the values for m and b. Let’s say that the following three points are available such as (3, 7), (4, 9), (5, 12). And, the ask is to find the best fit line.

We will apply the calculus technique and use the above formulas. We will use the following formula:

**∑-2Xi(yi – ŷi) = 0**

The following calculation will happen:

-2[3(7 – (3m + b)) + 4(9 – (4m + b)) + 5(12 – (5m + b))] = 0

=> 3*7 + 4*9 + 5*12 – (9m + 3b + 16m + 4b + 25m + 5b) = 0

=> 21 + 36 + 60 – (50m + 12b) = 0

=> **116 = 50m + 12b …. eq (1)**

Let’s use another formula to find another equation:

**∑-(yi – ŷi) = 0**

The following calculation will happen:

7 – (3m + b) + 9 – (4m + b) + 12 – (5m + b) = 0

=> **28 = 12m + 3b … eq(2)**

The above two equations can be solved and the values of m and b can be found.

## Summary

The ordinary least squares (OLS) method is a linear regression technique that is used to estimate the unknown parameters in a model. The method relies on minimizing the sum of squared residuals between the actual and predicted values. The OLS method can be used to find the best-fit line for data by minimizing the sum of squared errors or residuals between the actual and predicted values. And, the calculus method for minimizing the sum of squares residuals is take the partial derivative of the cost function with respect to the coefficients of determination, set the partial derivatives equal to zero and solve for each of the coefficients. The OLS method is also known as least squares method for regression or linear regression.

- Pandas: How to Create a Dataframe – Examples - September 29, 2022
- Central Limit Theorem: Concepts & Examples - September 25, 2022
- Probability concepts, formulas & real-world examples - September 25, 2022

Why sum of squared residuals are taken?