In this post, you will learn how to train a neural network for multi-class classification using the Python Keras library and the Sklearn IRIS dataset. If you are a deep learning enthusiast, this is a compact, hands-on way to learn how Keras is used to train a multi-class classification neural network.
The following topics are covered in this post:
- Keras neural network concepts for training multi-class classification model
- Python Keras code for fitting neural network using IRIS dataset
Keras Neural Network Concepts for training Multi-class Classification Model
Training a neural network for multi-class classification using Keras takes the following seven steps:
- Load the Sklearn IRIS dataset
- Prepare the dataset for training and testing by creating a training and test split
- Set up a neural network architecture by defining the layers and their activation functions
- Compile the neural network by choosing an optimizer, a loss function and metrics
- Prepare the multi-class labels as one-hot encoded categorical labels
- Fit the neural network
- Evaluate the model accuracy with the test dataset
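The label preparation step in the list above can be illustrated without Keras. The sketch below is only an approximation of what a categorical (one-hot) encoding produces for integer class labels; the helper name `one_hot` and the sample labels are hypothetical and chosen purely for illustration.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1.0 per row.

    Hypothetical helper for illustration; it assumes integer labels
    in the range 0 .. num_classes - 1.
    """
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # Set the column that matches each label to 1.0
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded


# IRIS has three classes, stored as the integers 0, 1 and 2
example_labels = [0, 2, 1, 2]
encoded_labels = one_hot(example_labels, num_classes=3)
print(encoded_labels)
```

Each row has exactly one non-zero entry, which is the shape of target that a softmax output layer with three nodes is trained against.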
Python Keras Code for Fitting Neural Network using IRIS Dataset
Here is the Python Keras code for training a neural network for multi-class classification of the IRIS dataset. Pay attention to the following aspects of the code given below:
- The Keras modules models and layers are imported for creating an instance of a sequential neural network and adding layers to it.
- A sequential neural network is created.
- Layers with activation functions are added to the network. Note how input_shape is set to match the number of features; for the IRIS dataset, the number of features is 4. The hidden layer uses the ‘relu’ activation function, although other commonly used activation functions such as ‘sigmoid’ or ‘tanh’ could be used instead.
- The output layer uses the softmax function to generate the probability associated with each class. Since there are three classes in the IRIS dataset, the output layer has three nodes.
- The neural network is compiled with three key components: optimizer (rmsprop), loss function (categorical cross-entropy) and metrics (accuracy).
- You will need to define the number of epochs and the batch size for the network.fit method.
- Training and test labels are converted into one-hot encoded labels using the Keras utility method to_categorical.
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
#
# Import Keras modules
#
from keras import models
from keras import layers
from keras.utils import to_categorical
#
# Create the network
#
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(4,)))
network.add(layers.Dense(3, activation='softmax'))
#
# Compile the network
#
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
#
# Load the iris dataset
#
iris = datasets.load_iris()
X = iris.data
y = iris.target
#
# Create training and test split
#
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)
#
# Create categorical labels
#
train_labels = to_categorical(y_train)
test_labels = to_categorical(y_test)
#
# Fit the neural network
#
network.fit(X_train, train_labels, epochs=20, batch_size=40)
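To get a feel for what epochs=20 and batch_size=40 mean for this dataset, the arithmetic can be worked through in a few lines. This is a rough sketch that assumes the 150-sample IRIS dataset with 30% held out for testing, as in the split above, and that the last batch of each epoch is simply smaller than the rest; the variable names are illustrative.

```python
import math

n_samples = 150   # number of rows in the IRIS dataset
batch_size = 40   # value passed to network.fit
epochs = 20       # value passed to network.fit

# 30% of the data is held out for testing, the rest is used for training
n_test = n_samples * 30 // 100
n_train = n_samples - n_test

# One weight update happens per batch; the last batch holds the remainder
batches_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * epochs

print("Training samples:", n_train)
print("Batches per epoch:", batches_per_epoch)
print("Size of last batch:", last_batch_size)
print("Total weight updates:", total_updates)
```

With so few updates, the reported accuracy can vary noticeably between runs, so increasing the number of epochs or lowering the batch size is a reasonable thing to experiment with.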
Once the network is fit, you can measure its accuracy on the test data using the following code. Note the use of the evaluate method.
#
# Get the accuracy of test data set
#
test_loss, test_acc = network.evaluate(X_test, test_labels)
#
# Print the test accuracy
#
print('Test Accuracy: ', test_acc, '\nTest Loss: ', test_loss)
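It may help to see how an accuracy figure relates to the softmax output of the network. The following is a minimal NumPy sketch, not the Keras implementation: the raw scores and true labels are made up for illustration, and the `softmax` helper is a hypothetical stand-in for the output layer's activation.

```python
import numpy as np


def softmax(scores):
    """Convert each row of raw scores into probabilities that sum to 1."""
    # Subtract the row maximum first for numerical stability
    shifted = scores - np.max(scores, axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)


# Made-up raw scores for three samples and three classes
raw_scores = np.array([[2.0, 0.5, 0.1],
                       [0.2, 0.3, 3.0],
                       [0.1, 1.5, 1.0]])
true_classes = np.array([0, 2, 2])  # made-up ground truth

probabilities = softmax(raw_scores)

# The predicted class is the one with the highest probability
predicted_classes = probabilities.argmax(axis=1)

# Accuracy is the fraction of samples where prediction and truth agree
accuracy = (predicted_classes == true_classes).mean()

print("Predicted classes:", predicted_classes)
print("Accuracy:", accuracy)
```

Here two of the three made-up samples are classified correctly, so the accuracy comes out at two thirds.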
Conclusions
Here is a summary of what you learned about using Keras to train a neural network for multi-class classification:
- Keras models and layers can be used to create a neural network instance and add layers to the network.
- You will need to define the number of nodes for each layer and the activation functions. Different layers can have different numbers of nodes and different activation functions.
- The output layer must have as many nodes as there are classes in a multi-class classification model.
- The input layer must have an input_shape that matches the number of features.
- For multi-class classification, you can use the softmax function as the activation function of the output layer.