SelectFromModel for Feature Importance
In this post, you will learn about how to use Sklearn SelectFromModel class for reducing the training / test data set to the new dataset which consists of features having feature importance value greater than a specified threshold value. This method is very important when one is using Sklearn pipeline for creating different stages and Sklearn RandomForest implementation (such as RandomForestClassifier) for feature selection. You may refer to this post to check out how RandomForestClassifier can be used for feature importance. The SelectFromModel usage is illustrated using Python code example.
Here are the steps and related python code for using SelectFromModel.
Here is the python code representing the above steps:
from sklearn.feature_selection import SelectFromModel
#
# Fit the estimator; forest is the instance of RandomForestClassifier
#
sfm = SelectFromModel(forest, threshold=0.1, prefit=True)
#
# Transform the training data set
#
X_training_selected = sfm.transform(X_train)
#
# Count of features whose importance value is greater than the threshold value
#
importantFeaturesCount = X_selected.shape[1]
#
# Here are the important features
#
X_train.columns[sorted_indices][:X_selected.shape[1]]
The above may give output such as the following as the important features whose importance is greater than threshold value:
Index(['proline', 'flavanoids', 'color_intensity', 'od_dilutedwines', 'alcohal'], dtype='object')
Here is the visualization plot for important features:
Large language models (LLMs) have fundamentally transformed our digital landscape, powering everything from chatbots and…
As Large Language Models (LLMs) evolve into autonomous agents, understanding agentic workflow design patterns has…
In today's data-driven business landscape, organizations are constantly seeking ways to harness the power of…
In this blog, you would get to know the essential mathematical topics you need to…
This blog represents a list of questions you can ask when thinking like a product…
AI agents are autonomous systems combining three core components: a reasoning engine (powered by LLM),…