A feedforward neural network, also known as a multi-layer perceptron, is composed of layers of neurons that propagate information forward. In this post, you will learn the concepts behind feedforward neural networks along with a Python code example. We will start by discussing what a feedforward neural network is and why it is used, and then walk through how to code one in Python. A clear understanding of the feedforward mechanism is essential for getting a good grasp of deep learning concepts. A feedforward neural network learns its weights using the backpropagation algorithm, which will be discussed in future posts.
A feedforward neural network represents the mechanism in which input signals are fed forward into the network, pass through its different layers in the form of activations, and finally result in some sort of prediction in the output layer. Here is an animation of a feedforward neural network that classifies input signals into one of the three classes shown in the output.
Note the following aspects of the above animation in relation to how the input signals (variables) are fed forward through the different layers of the network:
In a feedforward neural network, the value that reaches a neuron is the weighted sum of all input signals if the neuron is in the first hidden layer, or the weighted sum of the previous layer's activations for neurons in subsequent layers.
For the top-most neuron in the first hidden layer in the above animation, this is the value that will be fed into the activation function. You may want to check out my other post on how to represent a neural network as a mathematical model.
[latex]weightedSum = \theta^{(1)}_{10}x_0 + \theta^{(1)}_{11}x_1 + \theta^{(1)}_{12}x_2 + \theta^{(1)}_{13}x_3 + \theta^{(1)}_{14}x_4[/latex]
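As a quick illustration, here is a minimal NumPy sketch of the same computation. The names x and theta1 are hypothetical: x is assumed to hold the bias unit x_0 = 1 followed by the four input values, and theta1[0] holds the five weights of the top-most neuron.

import numpy as np
# Hypothetical input vector: bias unit x0 = 1 followed by 4 input values
x = np.array([1.0, 0.5, -1.2, 0.3, 2.0])
# Hypothetical weight matrix of the first hidden layer (6 neurons x 5 weights);
# theta1[0] corresponds to [theta_10, theta_11, theta_12, theta_13, theta_14]
theta1 = np.random.randn(6, 5)
# Weighted sum reaching the top-most neuron, as in the formula above
weighted_sum = theta1[0].dot(x)
# The activation is obtained by passing the weighted sum through
# an activation function such as tanh
activation = np.tanh(weighted_sum)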
For the top-most neuron in the second hidden layer in the above animation, this is the weighted sum that will be fed into the activation function:
[latex]weightedSum = \theta^{(2)}_{10}a^{(1)}_0 + \theta^{(2)}_{11}a^{(1)}_1 + \theta^{(2)}_{12}a^{(1)}_2 + \theta^{(2)}_{13}a^{(1)}_3 + \theta^{(2)}_{14}a^{(1)}_4 + \theta^{(2)}_{15}a^{(1)}_5 + \theta^{(2)}_{16}a^{(1)}_6[/latex]
Finally, this is the weighted sum reaching the first / top-most node in the output layer:
[latex]weightedSum = \theta^{(3)}_{10}a^{(2)}_0 + \theta^{(3)}_{11}a^{(2)}_1 + \theta^{(3)}_{12}a^{(2)}_2 + \theta^{(3)}_{13}a^{(2)}_3 + \theta^{(3)}_{14}a^{(2)}_4 + \theta^{(3)}_{15}a^{(2)}_5 + \theta^{(3)}_{16}a^{(2)}_6[/latex]
Based on the above formulas, one can determine the weighted sum reaching every node / neuron in every layer, which is then fed into the activation function.
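To make this concrete, the per-layer computation can be expressed as a single vectorized step. Here is a minimal sketch (the function name and tanh activation are illustrative assumptions): the weighted sums for a whole layer are the dot product of the previous layer's activations with that layer's weight matrix, plus the bias term (folded into x_0 / a_0 in the formulas above, kept separate here as in the code that follows), and the activations are obtained by applying the activation function element-wise.

import numpy as np
def layer_forward(prev_activations, W, b):
    # Weighted sum reaching every neuron in the layer
    z = prev_activations.dot(W) + b
    # Activation of every neuron in the layer (tanh assumed here)
    return np.tanh(z)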
The picture below (a computational graph) represents how the value of the objective function is calculated for a feedforward multi-layer perceptron neural network with one hidden layer. The diagram is taken from this page on the Dive into Deep Learning website.
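For completeness, here is a minimal sketch of how the value of such an objective function could be computed for a classification network, assuming a cross-entropy loss over softmax probabilities (the names are illustrative, not taken from the diagram):

import numpy as np
def cross_entropy_loss(probs, y):
    # probs: (num_examples, num_classes) softmax outputs
    # y: (num_examples,) integer class labels
    num_examples = probs.shape[0]
    # Probability assigned to the correct class for each example
    correct_class_probs = probs[np.arange(num_examples), y]
    # Average negative log-likelihood over all examples
    return -np.mean(np.log(correct_class_probs))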
In this section, you will learn how to represent a feedforward neural network using Python code.
As a first step, let’s create sample weights to be applied in the input layer, first hidden layer and second hidden layer. Here is the code. Note that the weights for each layer are created as a matrix of size M x N, where M is the number of neurons in the layer and N is the number of neurons in the next layer. Thus, the weight matrix applied to the input layer is of size 4 x 6, the weight matrix applied to the activations from the first hidden layer is 6 x 6, and the weight matrix applied to the activations from the second hidden layer is 6 x 3.
import numpy as np
from sklearn import datasets
#
# Generate a dataset with 4 features and 3 classes
#
np.random.seed(0)
X, y = datasets.make_blobs(n_samples=200, n_features=4, centers=3, random_state=0)
#
# Neural network architecture
# No of nodes in input layer = 4
# No of nodes in output layer = 3
# No of nodes in each of the two hidden layers = 6
#
input_dim = 4 # input layer dimensionality
output_dim = 3 # output layer dimensionality
hidden_dim = 6 # hidden layer dimensionality
#
# Weights and bias element for layer 1
# These weights are applied for calculating
# weighted sum arriving at neurons in 1st hidden layer
#
W1 = np.random.randn(input_dim, hidden_dim)
b1 = np.zeros((1, hidden_dim))
#
# Weights and bias element for layer 2
# These weights are applied for calculating
# weighted sum arriving at neurons in 2nd hidden layer
#
W2 = np.random.randn(hidden_dim, hidden_dim)
b2 = np.zeros((1, hidden_dim))
#
# Weights and bias element for layer 3
# These weights are applied for calculating
# weighted sum arriving at neurons in the final / output layer
#
W3 = np.random.randn(hidden_dim, output_dim)
b3 = np.zeros((1, output_dim))
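A quick sanity check confirms that the weight matrices have the sizes described above:

print(W1.shape, W2.shape, W3.shape)
# (4, 6) (6, 6) (6, 3)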
Let’s now see the Python code for propagating the input signal (variable values) through the different layers to the output layer. Pay attention to how the weighted sum at each hidden layer is passed through the tanh activation function, and how the output layer applies the softmax function to turn the final weighted sums into class probabilities.
#
# Forward propagation of input signals
# to 6 neurons in first hidden layer
# activation is calculated using the tanh function
#
z1 = X.dot(W1) + b1
a1 = np.tanh(z1)
#
# Forward propagation of activation signals from first hidden layer
# to 6 neurons in second hidden layer
# activation is calculated using the tanh function
#
z2 = a1.dot(W2) + b2
a2 = np.tanh(z2)
#
# Forward propagation of activation signals from second hidden layer
# to 3 neurons in output layer
#
z3 = a2.dot(W3) + b3
#
# Probability is calculated as an output
# of softmax function
#
probs = np.exp(z3) / np.sum(np.exp(z3), axis=1, keepdims=True)
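Since softmax produces a probability distribution over the three classes, each row of probs sums to 1, and the predicted class for each example is simply the index of the largest probability:

# Each row of probs sums to 1
print(np.allclose(probs.sum(axis=1), 1.0))   # True
# Predicted class for each of the 200 examples
predictions = np.argmax(probs, axis=1)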
Here is a summary of what you learned in this post about feedforward neural networks: input signals are propagated forward through the layers of the network; the value reaching each neuron is the weighted sum of the previous layer's activations (or of the input signals for the first hidden layer), which is then passed through an activation function such as tanh; and the output layer applies the softmax function to produce class probabilities. The weights themselves are learned using the backpropagation algorithm, which will be covered in future posts.
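To wrap up, the entire forward pass can be consolidated into a single function. This is just a sketch that restates the code above; training the weights via backpropagation is the subject of a future post.

def forward(X, W1, b1, W2, b2, W3, b3):
    # First hidden layer: weighted sum followed by tanh activation
    a1 = np.tanh(X.dot(W1) + b1)
    # Second hidden layer: weighted sum followed by tanh activation
    a2 = np.tanh(a1.dot(W2) + b2)
    # Output layer: weighted sum followed by softmax
    z3 = a2.dot(W3) + b3
    exp_z3 = np.exp(z3)
    return exp_z3 / exp_z3.sum(axis=1, keepdims=True)

probs = forward(X, W1, b1, W2, b2, W3, b3)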