Author Archives: Ajitesh Kumar

Ajitesh Kumar

I have recently been working in the area of data analytics, including data science and machine learning / deep learning. I am also passionate about different technologies, including programming languages such as Java/JEE, JavaScript, Python, R, and Julia, and technologies such as blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, and big data. I would love to connect with you on LinkedIn. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking.

Scikit-learn vs Tensorflow – When to use What?

In this post, you will learn when to use Scikit-learn vs TensorFlow. Data scientists and machine learning enthusiasts need to understand the difference so that they can use these libraries appropriately for different business use cases.

When to use Scikit-learn?

Scikit-learn is a great entry point for beginner data scientists. It provides efficient implementations of many machine learning algorithms and is simple to use. You can get started with Scikit-learn easily in a Jupyter notebook. Scikit-learn can be used to solve different kinds of machine learning problems, including some of the following: Classification (SVM, nearest neighbors, random …

Posted in Data Science, Machine Learning.

Machine Learning – Training, Validation & Test Data Set

In this post, you will learn about the concepts of training, validation, and test data sets used for training machine learning models. The post is most suitable for data science beginners or those who would like a clear understanding of these concepts. The following topics will be covered:

Data split – training, validation, and test data set
Different model performance based on different data splits

Data Splits – Training, Validation & Test Data Sets

You can split data into the following sets, and each split configuration will yield machine learning models with different performance: Training data set: When you …
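As a minimal sketch of the data split described above, the snippet below shuffles a list of records and cuts it into training, validation, and test sets. The function name `split_dataset`, the 60/20/20 proportions, and the seed value are assumptions chosen for illustration; they do not come from the post.

```python
import random


def split_dataset(records, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle records and split them into train, validation, and test lists.

    The default fractions and seed are illustrative assumptions.
    """
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must be between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for a repeatable split
    train_end = round(len(shuffled) * train_frac)
    val_end = train_end + round(len(shuffled) * val_frac)
    # Whatever remains after the training and validation slices is the test set.
    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Every record lands in exactly one of the three lists, so the model is never evaluated on data it was trained on.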

Posted in Data Science, Machine Learning.

Why use Random Seed in Machine Learning?

In this post, you will learn why and when we use random seed values while training machine learning models. This question is commonly asked by beginner data scientists and machine learning enthusiasts. We use a random seed value while creating training and test data sets. The goal is to make sure we get the same training and validation data sets while trying different hyperparameters or machine learning algorithms in order to assess the performance of different models. This is where the random seed value comes into the picture. Different Python libraries, such as scikit-learn, have different ways of assigning random seeds. While training machine learning models using Scikit-learn, …
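The effect of a fixed seed can be sketched with Python's standard `random` module, used here as a stand-in; the excerpt does not show how Scikit-learn itself assigns seeds. The helper name `shuffled_indices` and the seed values are hypothetical.

```python
import random


def shuffled_indices(n, seed):
    """Return the indices 0..n-1 in an order determined only by the seed."""
    rng = random.Random(seed)  # independent generator; global random state is untouched
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


run_a = shuffled_indices(10, seed=7)
run_b = shuffled_indices(10, seed=7)

# The same seed reproduces the same ordering, so a train/test split built
# from these indices would contain the same rows on every run.
print(run_a == run_b)
```

Because the ordering depends only on the seed, two models trained with different hyperparameters can be compared on identical data.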

Posted in Data Science, Machine Learning.

Deep Learning – Top 5 Online Jupyter Notebooks Servers

In this post, you will get information about online, GPU-based Jupyter notebook platforms that you can use to get started with both machine learning and deep learning. The list includes both free and paid options. When starting with GPUs, it is recommended to rent options available online rather than buy your own GPU servers. There are also online GPU Linux servers (free and paid) that can be used to train deep learning and machine learning models; I will write about them in my next post. Here is the list of Jupyter notebook platforms that could be used …

Posted in AI, Deep Learning.

Top Deep Learning Myths You should know

This post highlights the top deep learning myths you should know. Understanding them is important in order to leverage deep learning to solve complex AI problems. Many times, beginner to intermediate level machine learning enthusiasts avoid deep learning because of the myths discussed in this post. Without further ado, let’s look at the most common deep learning myths:

Good understanding of complex mathematical concepts: Well, that is just a myth. At times, people say that one needs an advanced degree in mathematics and statistics. That is not true. With the tools, programming languages, and libraries available today, basic mathematical concepts should be able …

Posted in AI, Deep Learning.

First Principles Understanding based on Physics

In this post, you will understand the concepts of first principles and first principles thinking based on physics concepts. Let’s jump in right away. In the meantime, you could also read one of my other posts on first principles: First-principles thinking explained with examples. It will help you get started on what first principles are and what first principles thinking is. One of the most fundamental physics concepts for understanding first principles is this: every physical quantity can be represented as either a derived quantity or a fundamental quantity. The fundamental quantities, also termed basic quantities, are the most basic and unique, and there are no overlaps between them. …

Posted in Reasoning.

Precision & Recall Explained using Covid-19 Example

In this post, you will learn about the concepts of precision, recall, and accuracy for machine learning classification models. Given that this is the Covid-19 age, the idea is to explain these concepts in terms of a classification model that predicts whether a patient is Corona positive or not based on symptoms and other details. The following model performance concepts will be described with the help of examples:

What is the model precision?
What is the model recall?
What is the model accuracy?
What is the model confusion matrix?
Which metrics to use – Precision or Recall?

Before getting into learning the concepts, let’s look at the data (hypothetical) derived out …
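The three metrics named above can be sketched directly from the four confusion-matrix counts using their standard definitions. The function name `classification_metrics` and the patient counts are hypothetical; they are not the data used in the post.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, and accuracy from confusion-matrix counts.

    tp: positive patients predicted positive
    fp: negative patients predicted positive
    fn: positive patients predicted negative
    tn: negative patients predicted negative
    """
    total = tp + fp + fn + tn
    # Precision: of all patients predicted positive, how many really are positive?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all patients who really are positive, how many were detected?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: share of all predictions that were correct.
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Hypothetical counts for 200 patients (illustrative only).
precision, recall, accuracy = classification_metrics(tp=80, fp=20, fn=10, tn=90)
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

In a screening setting, a false negative (an infected patient told they are healthy) is usually costlier than a false positive, which is one reason recall often gets more weight than precision there.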

Posted in AI, Data Science, Machine Learning.

Image Classification & Machine learning

In this post, you will learn how image classification problems can be solved using machine learning techniques. The following are some of the topics which will be covered:

How does the computer learn about an image?
How could machine learning be used to classify the images?

How does the computer learn about an image?

Unlike for human beings, an image has to be converted into numbers for a computer to learn about it. So, the question is: how can an image be converted into numbers? The most fundamental element, or the smallest building block, of an image is a pixel. An image can be represented as a set of …
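As a rough sketch of turning an image into numbers, the snippet below stores a tiny grayscale image as a grid of pixel intensities and flattens it into a feature vector. The 3x3 size, the 0 to 255 intensity range (the common 8-bit convention), and the helper name `to_feature_vector` are assumptions for illustration, since the excerpt is cut off before it gives its own representation.

```python
# A tiny 3x3 grayscale "image": each number is one pixel's intensity
# (assumed convention: 0 = black, 255 = white).
image = [
    [0, 128, 255],
    [64, 192, 32],
    [255, 0, 16],
]


def to_feature_vector(pixels, max_value=255):
    """Flatten a 2D pixel grid row by row and scale each value to [0, 1]."""
    return [value / max_value for row in pixels for value in row]


features = to_feature_vector(image)
print(len(features))  # one number per pixel
```

A vector like `features` is the kind of numeric input a classification model can be trained on.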

Posted in Machine Learning.

Actionable Insights Examples – Turning Data into Action

In this post, you will learn how to turn data into information and then into actionable insights with the help of a few examples. It will help data analysts, data scientists, and business analysts get a good understanding of what an actionable insight is. You will understand aspects related to data-driven decision making. Before getting into the details, let’s understand the problem at hand. The school authority is trying to assess and improve the health of students. Here is the question it is dealing with: How could we improve the overall health of the students in the school? We will look into the approach of finding the …

Posted in Analytics, Data Science.

When to use Deep Learning vs Machine Learning Models?

In this post, you will learn when to go for training deep learning models from the perspective of model performance and volume of data. As a machine learning engineer or data scientist, you may often wonder whether deep learning models can be used in place of traditional machine learning models trained using algorithms such as logistic regression, SVM, tree-based algorithms, etc. The objective of this post is to provide you with perspectives on when to go for traditional machine learning models vs deep learning models. The two key criteria based on which one can decide whether to go for deep learning vs traditional machine learning models are the following: …

Posted in Data Science, Deep Learning, Machine Learning.

Most Common Types of Machine Learning Problems

In this post, you will learn about the most common types of machine learning (ML) problems along with a few examples. Without further ado, let’s look at these problem types and understand the details.

Regression
Classification
Clustering
Time-series forecasting
Anomaly detection
Ranking
Recommendation
Data generation
Optimization

Problem types | Details | Algorithms
Regression | When the need is to predict numerical values, such kinds of problems are called regression problems. For example, house price prediction. | Linear regression, K-NN, random forest, neural networks
Classification | When there is a need to classify the data in different classes, it is called a classification problem. If there are two classes, it is called a binary classification problem. …

Posted in Data Science, Machine Learning.

Historical Dates & Timeline for Deep Learning

This post is a quick look at the timeline of historical dates in the evolution of deep learning. Without further ado, let’s get to the important dates and what happened on those dates in relation to deep learning:

Year | Details/Paper Information | Who’s who
1943 | An artificial neuron was proposed as a computational model of the “nerve net” in the brain. Paper: “A logical calculus of the ideas immanent in nervous activity,” Bulletin of Mathematical Biophysics, volume 5, 1943 | Warren McCulloch, Walter Pitts
Late 1950s | A neural network application for reducing noise in phone lines was developed. Paper: Andrew Goldstein, “Bernard Widrow oral history,” IEEE Global History Network, 1997 | Bernard …

Posted in Data Science, Deep Learning, Machine Learning.

Great Mind Maps for Learning Machine Learning

In this post, you will get to look at some great mind maps for learning different machine learning topics. I have gathered these mind maps from different web pages on the Internet. The idea is to reinforce our understanding of different machine learning topics using pictures. You may have heard the proverb: a picture is worth a thousand words. Keeping this in mind, I decided to pull together some of the great mind maps posted on different web pages. I will update this blog post from time to time. Whether you are a beginner data scientist or an experienced one, you may want to bookmark this page for refreshing your …

Posted in Data Science, Machine Learning.

Different Types of Distance Measures in Machine Learning

In this post, you will learn about different types of distance measures used in machine learning algorithms such as K-nearest neighbours, K-means, etc. Distance measures are used to measure the similarity between two or more vectors in multi-dimensional space. The following are the different forms of distance metrics / measures:

Geometric distances
Computational distances
Statistical distances

Geometric Distance Measures

Geometric distance metrics primarily measure the similarity between two or more vectors solely based on the distance between two points in multi-dimensional space. Examples of geometric distance measures are Minkowski distance, Euclidean distance, and Manhattan distance. Another form of geometric distance is cosine similarity, which we will discuss …
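The geometric measures named above can be sketched in plain Python using their standard textbook definitions. The function names and sample vectors are illustrative choices, not taken from the post. Manhattan and Euclidean distance are written as the order-1 and order-2 cases of Minkowski distance.

```python
import math


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def manhattan(a, b):
    """Manhattan distance: Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Euclidean distance: Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


u, v = [0, 0], [3, 4]
print(manhattan(u, v), euclidean(u, v))
print(cosine_similarity([1, 0], [0, 1]))
```

Note that cosine similarity depends only on direction, so `[1, 2]` and `[2, 4]` are maximally similar even though their Euclidean distance is non-zero.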

Posted in Data Science, Machine Learning.

Introduction to Algorithms & Related Computational Tasks

In this post, you will be introduced to some important classes of algorithms and the related computational tasks that can be handled using these algorithms. Here are the important classes of algorithms which will be briefly discussed in this post:

Divide and conquer algorithms
Graph-based algorithms
Greedy algorithms
Dynamic programming
Linear programming
NP-complete algorithms
Quantum algorithms

Divide-and-Conquer Algorithms

Divide and conquer algorithms solve problems using the divide and conquer strategy. The following are the steps of divide-and-conquer algorithms:

Breaking the problem into subproblems that are themselves smaller instances of the same type of problem
Recursively solving these subproblems
Appropriately combining their …
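The three divide-and-conquer steps listed above can be sketched with merge sort. The excerpt does not name merge sort; it is used here only as a familiar instance of the strategy.

```python
def merge(left, right):
    """Combine two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; the rest of the other side is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Sort a list using the divide-and-conquer strategy."""
    if len(items) <= 1:               # base case: nothing left to divide
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])    # steps 1 and 2: divide, then solve recursively
    right = merge_sort(items[mid:])
    return merge(left, right)         # step 3: combine the sub-results


result = merge_sort([5, 2, 9, 1, 5, 6])
print(result)
```

Each recursive call works on a smaller instance of the same sorting problem, and `merge` is the combining step.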

Posted in Algorithms.

Machine Learning Terminologies for Beginners

When starting on the journey of learning machine learning and data science, we come across several different terminologies in articles/posts, books, and video lectures. Getting a good understanding of these terminologies will help us understand the related concepts well. At a senior level, it gets tricky at times when the team of data scientists / ML engineers explain their projects and related outcomes. With this context, this post lists a set of commonly used machine learning terminologies that will help us get a good understanding of ML concepts and also engage with the DS / AI / ML team in …

Posted in Data Science, Machine Learning.