# Neural Network Explained with Perceptron Example

Neural networks are an important part of machine learning, so it is essential to understand how they work. A neural network is a computer system that has been modeled based on a biological neural network comprising neurons connected with each other. It can be built to solve machine learning tasks, like classification and regression problems. The perceptron algorithm is a representation of how neural networks work. The artificial neurons were first proposed by Frank Rosenblatt in 1957 as models for the human brain’s perception mechanism. This post will explain the basics of neural networks with a perceptron example. You will understand how a neural network is built using perceptrons. This is a very important concept in relation to getting a good understanding of deep learning. You will also learn related Tensorflow / Keras code snippets.

Here are the key concepts related to how a deep neural network is built using one or more perceptrons:

• First and foremost, it is key to understand what is a Perceptron?
• A perceptron is the most fundamental unit which is used to build a neural network. A perceptron resembles a neuron in the human brain. In the case of a biological neuron, multiple input signals are fed into a neuron via dendrite, and an output signal is fired appropriately based on the strength of the input signal and some other mechanism. In the case of a perceptron, the input signals are combined with different weights and fed into the artificial neuron or perceptron along with a bias element. Representationally, within perceptron, the net sum is calculated as the sum of weights and input signal, and a bias element, then, the net sum is fed into a non-linear activation function. Based on the activation function, the output signal is sent out.  A perceptron can also be called a single-layer neural network as it is the part of the only layer in the neural network where the computation occurs. The computation occurs on the summation of input data that is fed into it. To understand greater details around perceptron, here is my post – Perceptron explained with Python example

The way that Rosenblatt Perceptron differed from McCulloch-Pitts Neuron (1943) is that in the case of Perceptron, the weights were learned based on the input data pertaining to different categories.

The perceptrons laid out in form of single-layer or multi-layer neural networks can be used to perform both regression and classification tasks. The diagram below represents the single-layer neural network (perceptron) that represents linear regression (left) and softmax regression (each output o1, o2, and o3 represents logits). Recall that the Softmax regression is a form of logistic regression that normalizes an input value into a vector of values that follows a probability distribution whose total sums up to 1. The output values are in the range [0,1]. This allows the neural network to predict as many classes or dimensions as required. This is why softmax regression is sometimes referred to as a multinomial logistic regression.

Here is a picture of a perceptron represented as the sum of the summation of inputs with weights (w1, w2, w3, wm) and bias element, that is passed through the activation function and the final output is obtained. This can be very well used for both regression and binary classification problems.

Fig. Perceptron

Here is an example of a multi-output perceptron. Note that perceptron is stacked and there are two outputs. Also, note that the perceptrons are fully connected, e.g., each output is connected with every input through perceptrons. The layers consisting of perceptrons that are connected to all inputs are called dense layers or fully connected layers. The dense layer is also called a stack of perceptrons. A configuration such as the one given below having multiple output neurons can be used to solve multi-class classification problems.

Fig. Multi-output perceptron

The above dense layer consisting of two units can be represented using the following TensorFlow code:

import tensorflow as tf
layer = tf.keras.layer.Dense(units = 2)

• What is a deep neural network (DNN) and how it is made up of perceptrons?
• A deep neural network (DNN) can be said to represent the mathematical model of the human brain. As the human brain is made up a large number of neurons, a deep neural network is also made up of a large number of neuron-like units called perceptrons. A deep neural network is called “deep” because multiple levels of composition are learned. The multiple levels of composition can also be understood as representations that are used for predictions. The perceptron in the deep neural network is laid out in form of four or more layers including one input layer, one output layer, and two or more hidden layers. Thus, a deep neural network comprises four or more layers with each layer consisting of one or more perceptrons. A neural network consisting of one input layer, one hidden layer, and one output layer is commonly known as a multi-layer perceptron (MLP) neural network.

This is how the input flows through the network. The input comprising of one or more inputs combined with weights and along with a bias element is fed into each perceptron of the first layer. For each of the perceptrons, the weights can be different. The output from the perceptrons in the first layer is combined with new weights and fed into each perceptron of the second layer along with a bias element and so on and so forth until the final layer is reached. Note that weights combined with input to different perceptrons can be different. Such layers where all perceptrons are connected with all inputs combined with different weights are also termed dense layers. And, the neural network with dense layers is called a fully connected neural network. Refer to the picture given below showing a neural network comprising one hidden layer and one output layer. Note that each layer will have its own set of weights. Also, z1, z2, z3, and z4 represent net input which is the sum of weights and inputs. In the neural network shown below, there are 4 perceptrons in the hidden layer and 2 perceptrons in the output layer. Each of the perceptrons will have an associated non-linear activation function represented using the letter g.  When there is two or more hidden layer, such neural networks are called a deep neural network. And, the learning is called deep learning. In the picture below, a multi-layer perceptron (MLP) neural network is shown.

Fig 1. Multi-layer perceptron (MLP) neural network

In the picture below, note how net input is calculated as a sum of input signals and weights.

Fig 2. Neural network – Net input explained

The above can be represented as a sequential neural network with two layers (hidden and output layer) in TensorFlow using the following code:
import tensorflow as tf

model = tf.keras.Sequential([
tf.keras.layers.Dense(units=n),
tf.keras.layers.Dense(units=2)
])

# Above code can also be written as the following:
model = tf.keras.Sequential([
tf.keras.layers.Dense(n),
tf.keras.layers.Dense(2)
])



In the above code, the first layer is a dense layer having n units, and the second layer, which is also the output layer, is a dense layer having two units.

In case, there are 5 dense layers with 4 hidden and 1 output layer with n1 unit in the first layer, n2 in the second layer, n3 in the third layer, n4 in the 4th layer, and 3 units in the last layer, the model code in TensorFlow would look like the following:

import tensorflow as tf

model = tf.keras.Sequential([
tf.keras.layers.Dense(n1),
tf.keras.layers.Dense(n2),
tf.keras.layers.Dense(n3),
tf.keras.layers.Dense(n4),
tf.keras.layers.Dense(3)
])


## Another example of Neural Network using Tensorflow / Keras

Here is a screenshot for a simple piece of code to train an artificial neural network which can be used to identify different class of images:

Fig 3. Understanding artificial neural networks using Tensorflow and Keras

There are five important concepts to learn in the above Tensorflow code. They are as follows:

• The blue color box represents the input image of size 150 x 150 with 3-bit color depth
• The brown color box represents the output which are three different classes that neural networks with classifying any image.
• The red color box represents 512 which, simply speaking, are 512 functions with each function having its own internal variables. The boxes in the diagram below represent these 512 functions. These 512 functions (also called neurons – see the diagram below) are fed with image pixels and trained based on the given label or output as 1 for the actual class. Machine learning is about identifying those internal variables (also called parameters) in different neurons (512 functions) to match the desired output (given label). The activation function used for each of the internal functions is ReLU.
• In the beginning of the training, these 512 functions’ parameters are initialized with some random values. How do these parameters’ values get adjusted during training purpose. This is where the orange box comes into the picture. The categorical_crossentropy represents the loss function that calculates loss during training. The optimizer, rmsprop adjusts the value of parameters to decrease the loss after each iteration of training.
• Finally, epochs (green box) represent the number of iterations for training to happen.

The above can also be understood with the following diagram: