Author Archives: Ajitesh Kumar

Ajitesh Kumar

I have recently been working in the area of data analytics, including data science and machine learning / deep learning. I am also passionate about different technologies, including programming languages such as Java/JEE, JavaScript, Python, R, and Julia, and technologies such as blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, and big data. For the latest updates and blogs, follow us on Twitter. I would love to connect with you on LinkedIn. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking.

Python – Creating Scatter Plot with IRIS Dataset


In this blog post, we will learn how to create a scatter plot with the IRIS dataset using Python. The IRIS dataset is a collection of data that is often used to demonstrate the properties of various statistical models. It contains 150 observations (50 for each of three iris species) on four different variables: Petal Length, Petal Width, Sepal Length, and Sepal Width. As data scientists, it is important for us to be able to visualize the data that we are working with. Scatter plots are a great way to do this because they show the relationship between two variables. In this post, we have plotted and explored how Petal Length and Sepal Length …
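As a rough illustration of the kind of plot this post describes, here is a minimal sketch that plots sepal length against petal length. It assumes scikit-learn's bundled copy of the dataset and matplotlib; the choice of libraries and the variable names are assumptions made for this example, not necessarily what the full post uses.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()
# Columns of X: sepal length, sepal width, petal length, petal width (all in cm)
X, y = iris.data, iris.target

fig, ax = plt.subplots()
for label, name in enumerate(iris.target_names):
    mask = y == label  # rows belonging to one species
    ax.scatter(X[mask, 0], X[mask, 2], label=name)  # sepal length vs petal length
ax.set_xlabel(iris.feature_names[0])
ax.set_ylabel(iris.feature_names[2])
ax.legend()
```

To view the result, one could call `fig.savefig(...)` with a file name of choice, or drop the `Agg` line and call `plt.show()` in an interactive session.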

Posted in Data Science, Python.

Supervised & Unsupervised Learning Difference

Supervised vs Unsupervised Machine Learning Problems

Supervised and unsupervised learning are two common types of machine learning tasks that are used to solve many different types of business problems. Supervised learning uses labeled training data to build models that can predict outcomes for future datasets. Unsupervised learning is a type of machine learning task where the training data is not labeled or categorized in any way. For beginner data scientists, it is very important to understand the difference between supervised and unsupervised learning. In this post, we will discuss how supervised and unsupervised algorithms work and what the difference between them is. You may want to check …
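The contrast can be sketched in a few lines. The example below assumes scikit-learn and uses logistic regression and k-means purely as stand-ins for a supervised and an unsupervised algorithm; the dataset and model choices are assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Supervised: the labels y are used during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)  # fraction of correct predictions

# Unsupervised: only X is used; the algorithm groups points without labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
cluster_ids = kmeans.labels_  # cluster index per row, not a class label
```

Note that the cluster indices are arbitrary: cluster 0 need not correspond to class 0, which is one practical consequence of training without labels.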

Posted in AI, Data Science, Machine Learning.

Logit vs Probit Models: Differences, Examples

Logit vs probit models

Logit and probit models are statistical models that are used to model binary or dichotomous dependent variables. This means that the outcome of interest can only take on two possible values. In most cases, these models are used to predict whether or not something will happen. For example, a business might want to know if a particular advertising campaign will lead to an increase in sales. In this blog post, we will explain what logit and probit models are, and we will provide examples of how they can be used. As data scientists, it is important to understand the concepts of logit and probit models and when they should be …
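The core difference between the two models is the function that maps the linear predictor to a probability: logit uses the logistic CDF, while probit uses the standard normal CDF. The sketch below compares the two using only the standard library; the function names are chosen for this example.

```python
import math

def logistic_cdf(z):
    """Logit link: probability from the logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Probit link: probability from the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Compare the two links at a few values of the linear predictor z.
for z in (-2.0, 0.0, 2.0):
    print(z, round(logistic_cdf(z), 4), round(normal_cdf(z), 4))
```

Both functions return 0.5 at z = 0, but the normal CDF approaches 0 and 1 faster, so the logistic distribution has heavier tails.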

Posted in Data Science, Machine Learning, statistics.

Categorical Data Visualization: Concepts, Examples

bar chart data visualization for categorical data

Everyone knows that data visualization is one of the most important tools for any data scientist or statistician. It helps us to better understand the relationships between variables and identify patterns in our data. There are specific types of visualization used to represent categorical data. This type of data visualization can be incredibly helpful when it comes to analyzing our data and making predictions about future trends. In this blog, we will dive into what categorical data visualization is, why it’s useful, and some examples of how it can be used. Types of Data Visualizations for Categorical Dataset When it comes to visualizing categorical data sets, there are primarily four …

Posted in Data Science, statistics.

Types of Probability Distributions: Codes, Examples

uniform probability distribution plot

In this post, you will learn the definition of 25 different types of probability distributions. Probability distributions play an important role in statistics and in many other fields, such as economics, engineering, and finance. They are used to model all sorts of real-world phenomena, from the weather to stock market prices. Before we get into understanding different types of probability distributions, let’s understand some fundamentals. If you are a data scientist, you would like to go through these distributions. This page could also be seen as a cheat sheet for probability distributions. What are Probability Distributions? Probability distributions are a way of describing how likely it is for a random …
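As one concrete case, the continuous uniform distribution on an interval [a, b] has constant density 1 / (b - a) inside the interval and zero outside it. A minimal sketch follows; the function names and the interval endpoints are assumptions made for this example.

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """Probability that a uniform random variable on [a, b] is <= x."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

density = uniform_pdf(3.0, 2.0, 6.0)      # constant inside the interval
below_middle = uniform_cdf(4.0, 2.0, 6.0)  # half the mass lies below the midpoint
```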

Posted in AI, Data Science, Machine Learning, statistics.

Cross Entropy Loss Explained with Python Examples

In this post, you will learn the concepts related to the cross-entropy loss function along with Python code examples and which machine learning algorithms use the cross-entropy loss function as an objective function for training the models. Cross-entropy loss is used as a loss function for models which predict the probability value as output (probability distribution as output). Logistic regression is one such algorithm whose output is a probability distribution. You may want to check out the details on how cross-entropy loss is related to information theory and entropy concepts – Information theory & machine learning: Concepts What’s Cross-Entropy Loss? Cross-entropy loss, also known as negative log likelihood loss, is …
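For the binary case, cross-entropy loss averages -[y log(p) + (1 - y) log(1 - p)] over the samples, where y is the true label (0 or 1) and p is the predicted probability of class 1. The sketch below is a plain-Python illustration; the function name and the clipping constant `eps` are assumptions of this example rather than part of any library.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

confident = binary_cross_entropy([1, 0], [0.9, 0.1])  # good predictions, low loss
uncertain = binary_cross_entropy([1, 0], [0.5, 0.5])  # equals log(2)
```

The loss falls as the predicted probabilities move toward the true labels, which is why it works as a training objective for models that output probabilities.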

Posted in Data Science, Machine Learning.

Accuracy, Precision, Recall & F1-Score – Python Examples

Classification models are used in classification problems to predict the target class of a data sample. The classification model predicts the probability that each instance belongs to one class or another. It is important to evaluate the performance of a classification model in order to use it reliably in production for solving real-world problems. Performance measures are used to assess how well machine learning classification models perform in a given context. These performance metrics include accuracy, precision, recall, and F1-score. Evaluating model performance is essential in machine learning because it helps us understand the strengths and limitations of these models when making predictions in new situations. …
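The four metrics named above can be computed directly from the counts of true positives, false positives, false negatives, and true negatives. Here is a minimal plain-Python sketch for the binary case; the function name and the toy labels are made up for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of predicted positives, how many are right
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of actual positives, how many were found
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this toy example there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so all four metrics come out to 0.75.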

Posted in Data Science, Machine Learning, Python.

Data Variables Types & Uses in Data Science

Types of variables in data science

In data science, variables are the building blocks of any analysis. They allow us to group, compare, and contrast data points to uncover trends and draw conclusions. But not all variables are created equal; there are different types of variables that have specific uses in data science. In this blog post, we’ll explore the different variable types and their uses in data science. The picture below represents different types of variables one can find when working on statistics / data science projects: Let’s understand each type of variable in the following sections. Categorical / Qualitative Variables Categorical variables are a type of data that can be grouped into categories, based …

Posted in Data, Data Science, statistics.

Analytical thinking & Reasoning: Real-life Examples

analytical thinking

Analytical thinking and analytical reasoning are two concepts that are often misunderstood. Many people think that they are the same thing, but this is not the case. In fact, analytical thinking and analytical reasoning are two different, though related, things. Analytical thinking is an important aspect of analytical skills. Many of us do not know how to apply analytical thinking and often end up solving the problem incorrectly or half-heartedly. As data analysts or data scientists, it is of utmost importance to acquire this skill well. In this blog post, we will learn these concepts with the help of some real-life examples. What is analytical thinking? Analytical thinking …

Posted in Data analytics.

AI Product Manager Interview Questions

interview questions for machine learning

AI has become such an integral part of our lives that it is important to hire professionals who can help create AI / machine learning products that will be used by many people. These AI product manager interview questions will give you insight into your product manager candidate’s experience, skills, and industry knowledge so that you can get prepared in a better manner before appearing for your next interview as an AI product manager. Check out detailed interview questions and answers with a greater focus on machine learning topics. Before getting into the list of interview questions, let’s understand what the job description of an AI product manager can be. …

Posted in AI, Career Planning, Interview questions, Machine Learning, Product Management.

Climate Change Initiatives at Global Level

The effects of climate change are becoming more severe as each day passes. With the future of our planet at stake, it’s important to understand the global initiatives currently in place to reduce carbon emissions and mitigate the effects of climate change. In this blog post, we’ll look at some of the most important initiatives being taken around the world to address this critical issue. At some point, we will start using data from these initiatives for discussions on how businesses and governments could work together to save our planet while leveraging data analytics solutions.  UNFCCC (United Nations Framework Convention on Climate Change) UNFCCC is an international treaty that sets …

Posted in Climate Change, ESG.

Population & Samples in Statistics: Examples

characteristics of a sample

In statistics, population and sample are two fundamental concepts that help us to better understand data. A population is a complete set of objects from which we can obtain data. A population can include all people, animals, plants, or things in a given area. On the other hand, a sample is a subset of the population that is used for observation and analysis. In this blog, we will further explore the concepts of population and samples and provide examples to illustrate the differences between them in statistics. What is a population in statistics? In statistics, population refers to the entire set of objects or individuals about which we want to …
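The relationship between a population and a sample can be sketched with simulated data. The example below assumes a made-up population of 10,000 values (the mean of 170 and spread of 10 are arbitrary choices for this illustration) and draws a simple random sample of 100 from it.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated measurements.
population = [random.gauss(170, 10) for _ in range(10_000)]

# Sample: a subset drawn without replacement, used to estimate population values.
sample = random.sample(population, 100)

population_mean = statistics.mean(population)  # parameter (usually unknown in practice)
sample_mean = statistics.mean(sample)          # statistic (estimate of the parameter)
```

The sample mean will generally be close to, but not exactly equal to, the population mean, and the gap tends to shrink as the sample size grows.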

Posted in Data Science, statistics.

Data Governance Goals Explained with Examples

data governance goals and objectives

Data governance is an important element of any organization’s data management strategy. It provides a framework for creating, managing and monitoring data within an organization. This ensures that the data is accurate, consistent, secure, and compliant with all relevant regulations. It also allows organizations to make informed decisions based on quality data. In this blog post, we will explore the goals and objectives of good data governance and provide some examples to help you implement it in your organization. Here is the picture representing the most important goals of data governance: Protect the needs of data stakeholders Data governance helps protect the needs of data stakeholders by ensuring that the …

Posted in Data, Data Governance, Data management.

Climate Analysis & Top Questions for Leadership

climate change analysis and top questions

As a business leader, it is important to understand the impact of climate change on business and society at large and the importance of performing climate analysis. Climate analysis can provide useful insights on how your business can be more sustainable and efficient in its operations while building a resilient business. This blog will cover some of the key questions that leadership should consider when evaluating their current climate-change-related strategies. What is climate change impact analysis? Climate change analysis can be defined as an assessment of the existing environment in which a business operates and how it impacts the sustainability of the organization. It looks at everything from …

Posted in Climate Change, ESG.

Types of SQL Joins Explained with Examples

SQL Joins explained using Sets

Structured Query Language (SQL) is one of the most important and widely used tools for data manipulation. It allows users to interact with databases, query and manipulate data, and create reports. One of SQL’s most important features is its ability to join tables together in order to enrich, compare and analyze related data. In this article, we will discuss the different types of joins available in SQL and provide examples of how each can be used. What is SQL Join? SQL Joins are a technique used in Structured Query Language (SQL) to combine rows from two or more tables into a single result set. This is done by establishing relationships between the tables based …
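To make the idea concrete, the sketch below runs an INNER JOIN and a LEFT JOIN against an in-memory SQLite database using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their contents are hypothetical, made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# INNER JOIN: only customers that have at least one matching order.
inner_rows = conn.execute(
    "SELECT c.name, o.amount FROM customers c "
    "INNER JOIN orders o ON o.customer_id = c.id ORDER BY o.id"
).fetchall()

# LEFT JOIN: every customer; order columns are NULL (None) when there is no match.
left_rows = conn.execute(
    "SELECT c.name, o.amount FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id ORDER BY c.id, o.id"
).fetchall()
conn.close()
```

Here the customer without orders is dropped by the INNER JOIN but kept by the LEFT JOIN with `None` in place of the amount.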

Posted in Data, Data analytics, Database.

Types of Frequency Distribution & Examples

frequency distribution plot for continuous quantitative variables

Frequency distributions are an important tool for data scientists, statisticians, and other professionals who work with data. Frequency distributions help to organize and summarize data, making it easier to identify the behavior of the data, including patterns and trends. Evaluating the frequency distribution is one of the important techniques of univariate descriptive statistics. In this article, we’ll take a look at the concepts of the frequency distribution and its different types, and provide some examples of each. What is Frequency Distribution? Frequency distribution is a statistical tool used to represent the frequency with which different categories of a qualitative or quantitative variable occur. It provides an overview of the data and allows …
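A frequency distribution can be built with nothing more than a counter. The sketch below shows one for a qualitative variable and a grouped one for a quantitative variable; the data values and the bin width of 10 are made up for illustration.

```python
from collections import Counter

# Qualitative variable: count how often each category occurs.
colors = ["red", "blue", "red", "green", "blue", "red"]
freq = Counter(colors)
relative_freq = {category: count / len(colors) for category, count in freq.items()}

# Quantitative variable: group values into equal-width bins, then count per bin.
heights = [151, 158, 162, 165, 167, 171, 174, 178, 183]
bin_width = 10
binned = Counter((h // bin_width) * bin_width for h in heights)  # key = lower edge of the bin
```

For the heights, the key 160 stands for the interval from 160 up to (but not including) 170, and so on for the other bins.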

Posted in statistics.