Eigenvalues and eigenvectors are important concepts in linear algebra that have numerous applications in data science. They provide a way to analyze the structure of linear transformations and matrices, and are used extensively in many areas of machine learning, including feature extraction, dimensionality reduction, and clustering.
In simple terms, eigenvalues and eigenvectors are the building blocks of linear transformations. Eigenvectors represent the directions that a transformation leaves unchanged except for stretching or shrinking, while eigenvalues represent the factor by which vectors along those directions are scaled.
In this post, you will learn why and when you need to use eigenvalues and eigenvectors. As a data scientist or machine learning engineer, you need a good understanding of these concepts because they underpin one of the most popular dimensionality reduction techniques – Principal Component Analysis (PCA). In PCA, they help reduce the dimensionality of the data (mitigating the curse of dimensionality), resulting in a simpler model that is computationally efficient and generalizes better. In this post, the following topics will be covered:
Before getting a little technical, let’s understand why we need Eigenvalues and Eigenvectors, in layman’s terms.
To understand a thing or problem better, we tend to break it down into smaller components and study the properties of those components. When we decompose something into its most elementary parts, we gain a much deeper understanding of it. For example, to understand a wooden table, we can study the basic material – wood – from which it is made; the properties of the wood then tell us a lot about the properties of the table.
Let’s take another example: the integer 18. To understand it, we can decompose it into its prime factors, 2 x 3 x 3. This factorization tells us properties of 18 such as the following: it is divisible by 2, 3, 6, and 9, but not by 5 or by any prime other than 2 and 3.
Similarly, matrices can be broken down or decomposed in ways that reveal information about their functional properties that is not obvious from their representation as an array of elements. One of the most widely used kinds of matrix decomposition is eigen-decomposition, in which we decompose a matrix into a set of eigenvectors and eigenvalues. You may want to check out this page from the Deep Learning book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. By the way, this reasoning technique of breaking things down to the most basic elements of which they are made, in order to understand and innovate, is also called first principles thinking.
In simple words, eigenvectors and eigenvalues are used to determine a set of important variables (in the form of a vector) along with scaling factors along different dimensions (key dimensions based on variance) for analyzing the data in a better manner. Let’s take a look at the following picture:
When you look at the above picture (data) and identify it as a tiger, what key information (dimensions / principal components) do you use? Is it not the face, body, legs, and so on? These principal components/dimensions can be seen as eigenvectors, each with its own elements. For example, the body has elements such as color, build, and shape, while the face has elements such as nose, eyes, and color. The overall data (image) can be seen as a transformation matrix. When the data (transformation matrix) acts on the eigenvectors (principal components), the result is the eigenvectors multiplied by their scale factors (eigenvalues). Accordingly, you can identify the image as a tiger.
The solution to real-world problems often depends on processing a large volume of data representing many variables or dimensions. Take, for example, the problem of predicting stock prices – a machine learning / predictive analytics problem. Here the dependent variable is the stock price, and there is a large number of independent variables on which the stock price depends. Training one or more machine learning models on all of these independent variables (also called features) is computationally intensive, and the resulting models tend to be complex.
Can we use the information stored in these variables to extract a smaller set of variables (features) for training the models and making predictions, while ensuring that most of the information contained in the original variables is retained? Doing so results in simpler and computationally more efficient models. This is where eigenvalues and eigenvectors come into the picture.
Feature extraction algorithms such as Principal Component Analysis (PCA) rely on eigenvalues and eigenvectors to reduce the dimensionality of data (features), or to compress the data (data compression), in the form of principal components while retaining most of the original information. In PCA, the eigenvalues and eigenvectors of the feature covariance matrix are computed, and the top k eigenvectors are selected based on their corresponding eigenvalues. A projection matrix is then built from these eigenvectors and used to transform the original features into a new feature subspace. With a smaller set of features, computationally efficient models can be trained with reduced generalization error. Eigenvalues and eigenvectors are therefore key to training computationally efficient and high-performing machine learning models, and data scientists must understand these concepts well.
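To make this concrete, here is a minimal NumPy sketch of PCA via eigen-decomposition. The data matrix X and the number of components k below are illustrative placeholders; in practice you would typically use a library implementation such as scikit-learn's PCA.

```python
import numpy as np

# Minimal sketch of PCA via eigen-decomposition (illustrative only).
# X is a toy data matrix of shape (n_samples, n_features); k is the number
# of principal components to keep.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))           # toy data: 100 samples, 5 features
k = 2

X_centered = X - X.mean(axis=0)         # center each feature
cov = np.cov(X_centered, rowvar=False)  # feature covariance matrix (5 x 5)

# Eigen-decomposition; eigh is used because the covariance matrix is symmetric
eigvals, eigvecs = np.linalg.eigh(cov)

# Sort by descending eigenvalue and keep the top-k eigenvectors as the projection matrix
order = np.argsort(eigvals)[::-1]
W = eigvecs[:, order[:k]]               # projection matrix (5 x 2)

# Project the original features into the k-dimensional principal-component subspace
X_reduced = X_centered @ W              # shape (100, 2)
print(X_reduced.shape)
```

Since a covariance matrix is symmetric, its eigenvalues are real and its eigenvectors are orthogonal, which is why the symmetric solver is the natural choice here.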
Finding the eigenvalues and eigenvectors of a matrix is useful in several fields, such as the ones listed below, wherever a large volume of multi-dimensional data needs to be transformed into another subspace of smaller dimension while retaining most of the information stored in the original data. The primary goal is to achieve optimal computational efficiency.
Eigenvectors are vectors that, when multiplied by a matrix (a linear transformation), result in another vector with the same direction but scaled (hence a scalar multiple) in the forward or reverse direction by some magnitude, which is called the eigenvalue. In simpler words, eigenvalues are the scalar values that represent the factor by which an eigenvector is scaled when the linear transformation is applied.
Here is the formula for what is called the eigen-equation:
[latex]
Ax = \lambda x
[/latex]
In the above equation, the matrix A acts on the vector x and the outcome is another vector Ax with the same direction as the original vector x, but stretched or shrunk in the forward or reverse direction by the scalar multiple [latex]\lambda[/latex]. The vector x is called an eigenvector of A, and [latex]\lambda[/latex] is called its eigenvalue. Let’s first look pictorially at what happens when a matrix A acts on a general vector x. Note that, for a general (non-eigen) vector x, the new vector Ax has a different direction than x.
When the matrix multiplication with a vector results in another vector in the same or opposite direction, scaled forward or in reverse by a scalar multiple, the eigenvalue ([latex]\lambda[/latex]), then that vector is called an eigenvector of the matrix. Here is the diagram representing the eigenvector x of matrix A: the vector Ax lies in the same (or opposite) direction as x.
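As a quick numerical illustration (the matrix A and vector x below are example values), you can verify the eigen-equation directly with NumPy:

```python
import numpy as np

# Example values: A is a 2x2 matrix and x is one of its eigenvectors.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
x = np.array([1.0, 1.0])

Ax = A @ x          # A acting on x
print(Ax)           # [3. 3.] -> same direction as x, stretched by a factor of 3
print(Ax / x)       # [3. 3.] -> the eigenvalue (lambda = 3)
```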
Here is further information on the value of eigenvalues:
There are several key properties of eigenvalues and eigenvectors that are important to understand. These include:
Many disciplines traditionally represent vectors as matrices with a single column rather than as matrices with a single row. For that reason, the word “eigenvector” in the context of matrices almost always refers to a right eigenvector, namely a column vector.
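NumPy follows the same convention: np.linalg.eig returns the right eigenvectors as the columns of its second return value. A small sketch, using an example matrix:

```python
import numpy as np

# np.linalg.eig returns right eigenvectors as the *columns* of the second return value.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, eigvecs = np.linalg.eig(A)

v = eigvecs[:, 0]                            # first eigenvector (a column vector)
print(np.allclose(A @ v, eigvals[0] * v))    # True: A v = lambda v
```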
Here are the steps to calculate the eigenvalues and eigenvectors of any matrix A.
For calculating the eigenvalues, one needs to solve the following equation:
[latex]
Ax = \lambda x
\\Ax - \lambda x = 0
\\(A – \lambda I)x = 0
[/latex]
For a non-zero eigenvector x to exist, the matrix [latex](A - \lambda I)[/latex] must be singular (non-invertible), so the eigenvalues can be determined by solving the following characteristic equation:
[latex]
\det(A - \lambda I) = 0
[/latex]
In the above equation, I is the identity matrix and [latex]\lambda[/latex] is an eigenvalue. Once the eigenvalues are determined, the eigenvectors are found by solving the equation [latex](A - \lambda I)x = 0[/latex] for each eigenvalue.
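Here is a small sketch, using an example 2x2 matrix, of how the characteristic equation yields the eigenvalues and how the eigenvectors then follow from solving (A - λI)x = 0 (here recovered via the null space of A - λI):

```python
import numpy as np

# Example 2x2 matrix (values are illustrative).
A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# For a 2x2 matrix, det(A - lambda*I) = lambda^2 - trace(A)*lambda + det(A).
# Its roots are the eigenvalues.
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
eigvals = np.roots(coeffs)
print(eigvals)                                   # [5. 2.]

# For each eigenvalue, an eigenvector spans the null space of (A - lambda*I).
# The null-space direction is recovered from the SVD of (A - lambda*I).
for lam in eigvals:
    M = A - lam * np.eye(2)
    _, _, Vt = np.linalg.svd(M)
    v = Vt[-1]                                   # right singular vector for the zero singular value
    print(lam, v, np.allclose(A @ v, lam * v))   # True for both eigenvalues
```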
Eigenvectors and eigenvalues are powerful tools that can be used in a variety of ways in machine learning. Here are some of the scenarios / reasons when you can use eigenvalues and eigenvectors:
Here are some learnings from this post: