# Category Archives: Data Science

## Poisson Distribution Explained with Python Examples

In this post, you will learn about the concepts of the Poisson probability distribution with Python examples. As a data scientist, you need a good understanding of probability distributions, including the normal, binomial, and Poisson distributions. The Poisson distribution is a discrete probability distribution that gives the probability of an event occurring r times in a given interval of time or space, provided the events occur at a known constant mean rate and independently of each other. The following are the key criteria for a random variable to follow the Poisson distribution: Individual events occur at random and independently in a given interval. This can be an interval of time or …
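As a minimal sketch of the definition above, the probability of seeing exactly r events can be computed as P(X = r) = e^(-λ) λ^r / r!, where λ is the constant mean rate. The function name `poisson_pmf` and the mean rate of 3.0 are illustrative assumptions, not taken from the original post.

```python
import math


def poisson_pmf(r, mean_rate):
    """Probability of exactly r events when events occur at a constant mean rate."""
    if r < 0:
        raise ValueError("r must be a non-negative integer")
    return math.exp(-mean_rate) * mean_rate ** r / math.factorial(r)


# Hypothetical example: an average of 3 events per interval.
mean_rate = 3.0
for r in range(6):
    print(f"P(X = {r}) = {poisson_pmf(r, mean_rate):.4f}")
```

Summing the probabilities over all possible values of r gives 1, which is a quick sanity check for any implementation.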

## Geometric Distribution Explained with Python Examples

In this post, you will learn about the concepts of the geometric probability distribution with the help of real-world examples and Python code examples. It is of utmost importance for data scientists to understand and build an intuition for different kinds of probability distributions, including the geometric distribution. You may want to check out some of my posts on other probability distributions: Normal distribution explained with Python examples; Binomial distribution explained with 10+ examples; Hypergeometric distribution explained with 10+ examples. In this post, the following topics are covered: Geometric probability distribution concepts; Geometric distribution Python examples; Geometric distribution real-world examples. Geometric Probability Distribution Concepts: Geometric probability distribution is a discrete …
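The following is a small sketch of the geometric probability mass function. It assumes the common convention in which X counts the number of independent trials up to and including the first success, so that P(X = k) = (1 - p)^(k - 1) p for k = 1, 2, 3, and so on. The function name `geometric_pmf` and the success probability of 0.25 are illustrative assumptions.

```python
def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (k starts at 1)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return (1 - p) ** (k - 1) * p


# Hypothetical example: each trial succeeds with probability 0.25.
p = 0.25
for k in range(1, 6):
    print(f"P(first success on trial {k}) = {geometric_pmf(k, p):.4f}")
```

Under this convention the expected number of trials until the first success is 1 / p.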

## Top 10 Analytics Strategies for Great Data Products

In this post, you will learn about the top 10 data analytics strategies that will help you create successful data products. These strategies will be helpful if you are setting up a data analytics practice or center of excellence (COE). As an AI / machine learning / data science stakeholder, it is important to understand these strategies in order to deliver analytics solutions that create business value and have a positive business impact. Here are the top 10 data analytics strategies: Identify top 2-3 business problems; Identify related business / engineering organizations; Create measurement plan by identifying right KPIs; Identify analytics deliverables such as analytics reports, predictions, etc.; Gather data …

## Keras CNN Image Classification Example

In this post, you will learn how to train a Keras convolutional neural network (CNN) for image classification. Before looking at the Python / Keras code examples and related concepts, you may want to check my post on Convolution Neural Network – Simply Explained in order to get a good understanding of CNN concepts. Keras CNN Image Classification Code Example: First and foremost, we need to get the image data for training the model. In this post, the Keras CNN used for image classification is trained on the Kaggle Fashion MNIST dataset. Fashion-MNIST is a dataset of Zalando’s article images, consisting of a training set of 60,000 examples and a …

## Data Quality Challenges for Machine Learning Models

In this post, you will learn about some of the key data quality challenges that need to be dealt with in a consistent and sustained manner to ensure high-quality machine learning models. Note that high-quality models are models that generalize better (lower true error in predictions) on unseen data or data drawn from a larger population. As a data science architect or quality assurance (QA) professional dealing with the quality of machine learning models, you must learn about these challenges and plan appropriate development processes to deal with them. Here are some of the key data quality challenges which need to be tackled appropriately in …

## Data Quality Assessment Frameworks – Machine Learning

In this post, you will learn about data quality assessment frameworks / techniques in relation to machine learning, and why one needs to assess data quality in order to build high-performance machine learning models. As a data science architect or development manager, you must get a sense of the importance of data quality in building high-performance machine learning models. The idea is to understand the value of a data set. The goal is to determine whether the value of the data can be quantified. This is because it is important to understand whether the data contains rich information which could be valuable for building models and inform stakeholders on data …

## Keras Neural Network for Regression Problem

In this post, you will learn how to train a neural network for regression machine learning problems using Python Keras. Regression problems are those that involve predicting a continuous numerical value based on input parameters / features. You may want to check out the following posts on how to use Keras to train a neural network for classification problems: Keras – How to train neural network to solve multi-class classification; Keras – How to use learning curve to select most optimal neural network configuration for training classification model. In this post, the following topics are covered: Design Keras neural network architecture for regression; Keras neural network …

## Keras – Categorical Cross Entropy Loss Function

In this post, you will learn when to use the categorical cross entropy loss function when training a neural network using Python Keras. Generally speaking, the loss function is used to compute the quantity that the model should seek to minimize during training. For regression models, the most commonly used loss function is the mean squared error, while for classification models that predict probabilities, the most commonly used loss function is cross entropy. In this post, you will learn about the different types of cross entropy loss functions used to train a Keras neural network model. Cross entropy loss function is an optimization function which is used in case …
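To make the idea concrete, here is a minimal plain-Python sketch of categorical cross entropy for a single sample, computed as the negative sum of y_i * log(p_i) over the classes. It assumes a one-hot encoded true label and predicted probabilities that sum to 1. The function name, the clipping constant `eps`, and the example values are assumptions for illustration; this is not the Keras implementation.

```python
import math


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross entropy between a one-hot label and predicted class probabilities."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Clip predictions away from 0 so that log() is always defined.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Hypothetical 3-class example where the true class is the second one.
y_true = [0, 1, 0]
confident = [0.1, 0.7, 0.2]
unsure = [0.4, 0.3, 0.3]

print(categorical_cross_entropy(y_true, confident))  # smaller loss
print(categorical_cross_entropy(y_true, unsure))     # larger loss
```

The loss shrinks as the probability assigned to the true class approaches 1, which is the quantity the model seeks to minimize during training.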

## Python Keras – Learning Curve for Classification Model

In this post, you will learn how to train an optimal neural network using learning curves and Python Keras. As a data scientist, it is good to understand the concept of the learning curve for a neural network classification model in order to select the best neural network configuration for training a high-performance model. In this post, the following topics are covered: Concepts related to training a classification model using a neural network; Python Keras code for creating an optimal neural network using a learning curve. Training a Classification Neural Network Model using Keras: Here are some of the key aspects of training a neural network classification model using Keras: …

## Keras Multi-class Classification using IRIS Dataset

In this post, you will learn how to train a neural network for multi-class classification using the Python Keras library and the Sklearn IRIS dataset. As a deep learning enthusiast, it will be good to learn how to use Keras for training a multi-class classification neural network. The following topics are covered in this post: Keras neural network concepts for training multi-class classification model; Python Keras code for fitting neural network using IRIS dataset. Keras Neural Network Concepts for training Multi-class Classification Model: Training a neural network for multi-class classification using Keras requires the following seven steps: Loading Sklearn IRIS dataset; Prepare the dataset for training and testing …

## Neural Network Back-Propagation Python Examples

In this post, you will learn about the concepts of the neural network back propagation algorithm along with Python examples. As a data scientist, it is very important to learn the concepts of the back propagation algorithm if you want to get good at deep learning models. This is because the back propagation algorithm is key to learning the weights at different layers in a deep neural network. What’s Back Propagation Algorithm? The backpropagation algorithm propagates the gradient of the final output with respect to the output of each node (in each layer) in the backward direction, right back to the input layer nodes. All that is achieved using the backpropagation algorithm is to …
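The backward flow of gradients can be sketched with a deliberately tiny network: one input, one hidden sigmoid node, and one output sigmoid node, trained with a squared error loss. The network shape, the loss, and all names and numbers below are assumptions chosen for illustration; they are not taken from the original post.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w1, w2, x):
    """Forward pass: input -> hidden node -> output node."""
    hidden = sigmoid(w1 * x)
    output = sigmoid(w2 * hidden)
    return hidden, output


def loss(w1, w2, x, y):
    """Squared error between the network output and the target y."""
    _, output = forward(w1, w2, x)
    return 0.5 * (output - y) ** 2


def backward(w1, w2, x, y):
    """Backward pass: apply the chain rule from the output back to the input."""
    hidden, output = forward(w1, w2, x)
    # Gradient of the loss at the output node (before its sigmoid).
    delta_output = (output - y) * output * (1 - output)
    grad_w2 = delta_output * hidden
    # Propagate the gradient one layer back to the hidden node.
    delta_hidden = delta_output * w2 * hidden * (1 - hidden)
    grad_w1 = delta_hidden * x
    return grad_w1, grad_w2


# Hypothetical weights, input, and target.
w1, w2, x, y = 0.5, -0.3, 1.5, 1.0
grad_w1, grad_w2 = backward(w1, w2, x, y)
print(grad_w1, grad_w2)
```

A gradient computed this way can be checked against a finite-difference estimate of the loss, which is a common way to catch mistakes in a hand-written backward pass.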

## Data Storytelling Explained with Examples

In this post, you will learn about some of the key concepts of data storytelling and why data scientists / data analysts should acquire this skill. Data storytelling is one of the key skills that data scientists need in order to do a great job of presenting data as a story. Most of the time, data scientists merely present multiple plots with the sole aim of showing the logic and reasoning. However, it is equally important to present the data as a story, as this creates an emotional connection with stakeholders and helps them make decisions. Thus, data scientists …

## Feed Forward Neural Network Python Example

In this post, you will learn about the concepts of the feed forward neural network along with a Python code example. In order to get a good understanding of deep learning concepts, it is of utmost importance to learn the concepts behind the feed forward neural network in a clear manner. A feed forward neural network learns its weights using the back propagation algorithm, which will be discussed in future posts. In this post, the following topics are covered: What’s feed forward neural network?; Feed forward neural network Python example. What’s Feed Forward Neural Network? A feed forward neural network represents the mechanism in which input signals are fed forward into a neural network and pass through the different layers of the …
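As a rough sketch of this mechanism, the code below feeds an input through a list of layers, where each layer computes a weighted sum plus bias for every node and applies a sigmoid activation. The layer sizes, weights, biases, activation choice, and function names are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One layer: each row of weights produces the activation of one node."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def feed_forward(inputs, layers):
    """Pass the input signals forward through every layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden nodes -> 1 output node.
layers = [
    ([[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]], [0.0, 0.1, -0.1]),
    ([[0.6, -0.2, 0.9]], [0.05]),
]
output = feed_forward([1.0, 2.0], layers)
print(output)
```

Because the last activation is a sigmoid, the single output value lies between 0 and 1.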

## What’s Softmax Function & Why do we need it?

In this post, you will learn about the concepts of the Softmax function with a Python code example, and why we need the Softmax function. As a data scientist / machine learning enthusiast, it is very important to understand the Softmax function, as it helps in better understanding algorithms such as neural networks and multinomial logistic regression. Note that the Softmax function is used in various multiclass classification machine learning algorithms such as multinomial logistic regression (thus also called softmax regression), neural networks, etc. What’s Softmax Function? Simply speaking, the Softmax function converts raw values (the outcomes of functions) into probabilities. Here is what the softmax function looks like: …
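A minimal sketch of softmax in plain Python is shown below: each raw value z_i is mapped to exp(z_i) divided by the sum of exp(z_j) over all values. Subtracting the maximum before exponentiating is a standard numerical-stability step that does not change the result. The function name and the sample scores are illustrative assumptions.

```python
import math


def softmax(values):
    """Convert a list of raw scores into probabilities that sum to 1."""
    largest = max(values)
    # Shift by the maximum so that exp() never overflows.
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three classes.
scores = [2.0, 1.0, 0.1]
probabilities = softmax(scores)
print(probabilities)
print(sum(probabilities))
```

Larger raw values always receive larger probabilities, so the ranking of the classes is preserved.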

## Cross Entropy Loss Explained with Python Examples

In this post, you will learn the concepts related to the cross entropy loss function along with Python examples, and which machine learning algorithms use cross entropy loss as an optimization function. Cross entropy loss is used as a loss function for models that output a probability value (a probability distribution). Logistic regression is one such algorithm whose output is a probability distribution. In this post, the following topics are covered: What’s cross entropy loss?; Cross entropy loss explained with Python examples. What’s Cross Entropy Loss? Cross entropy loss function is an optimization function which is used for training machine learning classification models which classifies the data by predicting the …
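For a model such as logistic regression that outputs the probability of the positive class, the cross entropy loss can be sketched as the average of -(y * log(p) + (1 - y) * log(1 - p)) over the samples. The function name, the clipping constant `eps`, and the sample labels and predictions are assumptions for illustration.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Average cross entropy for 0/1 labels and predicted probabilities."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip the probability so that log() is always defined.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical labels with good and poor predictions.
labels = [1, 0, 1, 1]
good = [0.9, 0.1, 0.8, 0.7]
poor = [0.4, 0.6, 0.3, 0.5]

print(binary_cross_entropy(labels, good))  # smaller loss
print(binary_cross_entropy(labels, poor))  # larger loss
```

Predictions that place high probability on the correct class give a loss close to 0, while confident wrong predictions are penalized heavily.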

## Python Sklearn – How to Generate Random Datasets

In this post, you will learn about some useful random dataset generators provided by Python Sklearn. Many methods are provided as part of the sklearn.datasets package. In this post, we will look at the most common ones, which can be used to create data sets for proof-of-concept solutions for regression, classification, and clustering machine learning algorithms. As a data scientist, you must get familiar with these methods in order to quickly create datasets for training models using different machine learning algorithms. Methods for generating datasets for Classification; Methods for generating datasets for Regression. Methods for Generating Datasets for Classification: The following is the list of …
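As a brief sketch, and assuming scikit-learn is installed, `make_classification` and `make_regression` from `sklearn.datasets` can generate small random datasets for quick experiments. The parameter values below (sample counts, feature counts, noise level, and random seed) are arbitrary choices for illustration and may differ from those used in the original post.

```python
from sklearn.datasets import make_classification, make_regression

# Random two-class dataset: 100 samples, 4 features, 2 of them informative.
X_cls, y_cls = make_classification(
    n_samples=100,
    n_features=4,
    n_informative=2,
    n_redundant=0,
    random_state=42,
)

# Random regression dataset: 100 samples, 3 features, Gaussian noise on the target.
X_reg, y_reg = make_regression(
    n_samples=100,
    n_features=3,
    noise=10.0,
    random_state=42,
)

print(X_cls.shape, y_cls.shape)
print(X_reg.shape, y_reg.shape)
```

Fixing `random_state` makes the generated data reproducible across runs.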