Different Types of Distance Measures in Machine Learning

Euclidean Distance formula

In this post, you will learn different types of distance measures used in different machine learning algorithms such as K-nearest neighbours, K-means etc.

Distance measures are used to measure the similarity between two or more vectors in multi-dimensional space. The following represents different forms of distance metrics / measures:

  • Geometric distances
  • Computational distances
  • Statistical distances

Geometric Distance Measures

Geometric distance metrics, primarily, tends to measure the similarity between two or more vectors solely based on the distance between two points in multi-dimensional space. The examples of such type of geometric distance measures are Minkowski distance, Euclidean distance and Manhattan distance. One other different form of geometric distance is cosine similarity which will discuss in this section.

Minkowski distance is the general form of Euclidean and Manhattan distance. Mathematically, it can be represented as the following:

Fig 1. Minkowski Distance Formula

When the value of P becomes 1, it is called Manhattan distance. When P takes the value of 2, it becomes Euclidean distance. Let’s understand these distance measures with illustrative diagrams.

Euclidean Distance

The formula of Euclidean distance looks like the following. It is formed by assigning the value of P as 2 in Minkowski distance formula.

Fig 2. Euclidean Distance Formula

Let’s look at another illustrative example to understand Euclidean distance. Here it goes.

Fig 3. Euclidean Distance

Euclidean distance looks very similar to the Pythagorean theorem. It is the most intuitive type of distance measure which can be used to calculate distances between two different points. Let’s say you want to find the distance between two different points in your city, you can use Euclidean distance for same. Here is the diagram which represents the same.

Euclidean distance between two different points
Fig 3. Euclidean distance between two different points

Recall the definition of displacement in Physics. Displacement is defined as the shortest distance between two different, and, so is Euclidean distance.

Manhattan Distance

If you want to find Manhattan distance between two different points (x1, y1) and (x2, y2) such as the following, it would look like the following:

Manhattan distance = (x2 – x1) + (y2 – y1)

Diagrammatically, it would look like traversing the path from point A to point B while walking on the pink straight line.

Fig 4. Manhattan distance between two points A (x1, y1) and B (x2, y2)

Another diagram which illustrates the Manhattan distance is the following. It is the path traversed represented by the line with arrow.

Manhattan distance between two points A and B
Fig 5 Manhattan distance between two points A and B

Let’s look at the diagram given below to understand Euclidean, Manhattan and Minkowski distance. Note that different value of P such as 1 and 2 in Minkowski distance results in Euclidean and Manhattan distance respectively. The distance measures used to measure distance between two points in the city are represented with different value of P in Minkowski distance formula.

Fig 6. Minkowski, Euclidean and Manhattan Distance

Cosine Distance / Similarity

Cosine similarity is a measure of similarity between two non-zero vectors. It is calculated as an inner product of the two vectors that measures the cosine of the angle between them. The vectors which are most similar will have 0 degree between them. In other words, the most similar vectors will coincide with each other. The value of cos 0 is 1. The vector which will be opposite to each other or most dissimilar will have value as -1 (cos(180deg).

The advantage of using cosine distance is the sheer speed at calculating distances between sparse vectors. For instance, if there are 500 attributes collected about houses and 200 of these were mutually exclusive (meaning that one house had them but the others don’t), then there would only be need to include 300 dimensions in the calculation.

Here is the formula for cosine similarity between two vectors a and b having attributes in n dimensions

Cosine similarity between two vectors
Fig 7. Cosine similarity between two vectors

The diagram below represents cosine similarity between two vectors having different angle between them.

Cosine Similarity between two vectors
Fig 8. Cosine Similarity between two vectors

Statistical Distance Measures

Statistical distance measures is used to calculate distance between two statistical objects. The statistical distance measure is used in solving problems such as hypothesis testing, independence contrast, goodness of fit tests, classification tasks, outlier detection, density estimation methods etc. The following represents some of the important types of statistical distance measures:

  • Mahalanobis distance
  • Jaccard distance

Mahalanobis Distance

Mahalanobis distance is one type of statistical distance measure which is used to compute the distance from the point to the centre of a distribution. It is ideal to solve the outlier detection problem. The distance of a point P from probability distribution D is how far away standard deviation P is from the mean of probability distribution D. If the point P is at the mean of the probability distribution, the distance is zero (0).

Two points may seem to have same Euclidean distance but different Mahalanobis distance and hence are not similar. Look at the diagram given below.

Fig 9. Mahalanobis distance (Point X is found as outlier) – Source

You may note that point X can be seen as an outlier in the data distribution shown in the above diagram although the Euclidean distance of red and green points is same from the mean.

Jaccard Distance

One another type of statistical measure is Jaccard Distance. It takes into consideration the intersection or overlap of attributes between two objects between which the distance needs to be found. For example, lets say there are two points represented by a set of attributes. If one or more attributes for one points match another, then they would be overlapping and therefore close in distance. If the points had diverging attributes they wouldn’t match. The following diagram represents the Jaccard distance formula:

Statistical Distance Measure - Jaccard Distance
Fig 10. Statistical Distance Measure – Jaccard Distance
Ajitesh Kumar
Follow me

Ajitesh Kumar

I have been recently working in the area of Data analytics including Data Science and Machine Learning / Deep Learning. I am also passionate about different technologies including programming languages such as Java/JEE, Javascript, Python, R, Julia, etc, and technologies such as Blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, big data, etc. For latest updates and blogs, follow us on Twitter. I would love to connect with you on Linkedin. Check out my latest book titled as First Principles Thinking: Building winning products using first principles thinking. Check out my other blog, Revive-n-Thrive.com
Posted in Data Science, Machine Learning. Tagged with , .

Leave a Reply

Your email address will not be published. Required fields are marked *