Author Archives: Ajitesh Kumar

Ajitesh Kumar

I have recently been working in the area of data analytics, including data science and machine learning / deep learning. I am also passionate about different technologies, including programming languages such as Java/JEE, JavaScript, Python, R, Julia, etc., and technologies such as blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, big data, etc. For latest updates and blogs, follow us on Twitter. I would love to connect with you on LinkedIn. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking. Check out my other blog,

Different Types of CNN Architectures Explained: Examples

VGG16 CNN Architecture

Last updated: 4th Dec, 2023. In the fast-paced world of computer vision and image processing, one problem consistently stands out: image classification, the ability to effectively recognize and classify images. As we continue to digitize and automate our world, the demand for systems that can understand and interpret visual data is growing at an unprecedented rate. The challenge is not just about recognizing images – it’s about doing so accurately and efficiently. Traditional machine learning methods often fall short, struggling to handle the complexity and high dimensionality of image data. This is where Convolutional Neural Networks (CNNs) come to the rescue. And, there are different types of CNN architectures based …

Posted in Deep Learning, Machine Learning.

MongoDB – Commands to Check the Status of MongoDB Database

This article presents different commands which can be used to check the status of a MongoDB database on Linux/Ubuntu. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. MongoDB Status Check Commands The following are some of the commands that can be used to check the status of a MongoDB database. Note that mongod is the daemon process of the MongoDB database and is primarily used to manage database access. It is recommended to check the log file (/var/log/mongo/mongo.log) to get details. Following are some of the commands which can be used to get the status of MongoDB: service mongod status: Displays the status …

Posted in NoSQL.

Logit vs Probit Models: Differences, Examples

Logit vs probit models

Logit and probit models are both types of regression models commonly used in statistical analysis, particularly for binary classification. This means that the outcome of interest can only take on two possible values / classes. In most cases, these models are used to predict whether or not something will happen, in the form of a binary outcome. For example, a bank might want to know whether a particular borrower will default on a loan or not. In this blog post, we will explain what logit and probit models are, and we will provide examples of how they can be used. As data scientists, it is important to understand the concepts …
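As a rough illustration of the difference hinted at in the excerpt above: the two models map a linear score z to a probability through different link functions, the logistic CDF for logit and the standard normal CDF for probit. The sketch below is a minimal comparison, not taken from the full post; the function names and the sample scores are assumptions chosen for illustration.

```python
import math


def logit_probability(z):
    """Logit model link: logistic CDF of the linear score z."""
    return 1.0 / (1.0 + math.exp(-z))


def probit_probability(z):
    """Probit model link: standard normal CDF of the linear score z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical linear scores, e.g. b0 + b1 * x for a borrower.
for z in (-2.0, 0.0, 2.0):
    print(f"z={z:+.1f}  logit={logit_probability(z):.4f}  probit={probit_probability(z):.4f}")
```

Both functions return 0.5 at z = 0, but the normal CDF approaches 0 and 1 faster than the logistic CDF, so the fitted coefficients of the two models are on different scales.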

Posted in Data Science, Machine Learning, statistics.

Linear Regression Cost Function: Python Example

Cost function in linear regression

Linear regression is a foundational algorithm in machine learning and statistics, used for predicting numerical values based on input data. Understanding the cost function in linear regression is crucial for grasping how these models are trained and optimized. In this blog, we will look at different aspects of the cost function used in linear regression, including how it helps in building a high-performing regression model. What is a Cost Function in Linear Regression? In linear regression, the cost function quantifies the error between predicted values and actual data points. It is a measure of how far off a linear model’s predictions are from the actual values. The most commonly …
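To make the idea of "how far off the predictions are" concrete, here is a minimal sketch that assumes mean squared error (MSE) as the cost function; the excerpt is cut off before naming one, so that choice, the function name and the toy data are all assumptions.

```python
def mse_cost(xs, ys, slope, intercept):
    """Mean squared error of the line y = slope * x + intercept on the data."""
    squared_errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / len(xs)


# Toy data that lies exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(mse_cost(xs, ys, slope=2.0, intercept=0.0))  # perfect fit, zero cost
print(mse_cost(xs, ys, slope=1.0, intercept=0.0))  # worse fit, larger cost
```

Training a linear regression model amounts to searching for the slope and intercept that make this number as small as possible.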

Posted in Data Science, Machine Learning, Python.

KNN vs Logistic Regression: Differences, Examples

Difference between K-Nearest Neighbors (KNN) and Logistic Regression algorithms

In this blog, we will learn about the differences between K-Nearest Neighbors (KNN) and Logistic Regression, two pivotal algorithms in machine learning, with the help of examples. The goal is to understand the intricacies of KNN’s instance-based learning and Logistic Regression’s probability modeling for binary and multinomial outcomes, offering clarity on their core principles. We will also navigate through the practical applications of KNN and logistic regression algorithms, showcasing real-world examples in various business domains like healthcare and finance. Accompanying this, we’ll provide concise Python code samples, guiding you through implementing these algorithms with datasets. This dual focus on theory and practicality aims to equip you with both the understanding …
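The instance-based nature of KNN mentioned above can be sketched in a few lines: there is no training step, and a prediction is a majority vote among the k stored points closest to the query. This is a minimal pure-Python sketch, not the code from the full post; the function name, the Euclidean distance choice and the toy points are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Pair each stored point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature dataset with two well-separated classes.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (8.5, 8.5)))
```

Logistic regression, by contrast, learns a fixed set of coefficients during training and does not need the training points at prediction time.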

Posted in Data Science, Machine Learning, Python.

Linear Regression vs Logistic Regression: Differences

simple linear regression model 1

Last updated: 1st Dec, 2023 In the ever-evolving landscape of machine learning, two algorithms stand out for their simplicity and effectiveness: Linear Regression and Logistic Regression. But what exactly are these algorithms, and how do they differ from each other? At first glance, logistic regression and linear regression might seem very similar – after all, they share the word “regression.” However, the devil, as they say, is in the details. Each method is uniquely tailored to solve specific types of problems, and understanding these subtleties is key to unlocking their full potential. Linear regression and logistic regression are both machine learning algorithms used for modeling relationships between variables but perform …

Posted in Data Science, Machine Learning, statistics.

6 Types of Brainstorming Techniques for Ideas Generation

Mind mapping brainstorming ideas

Last updated: 1st Dec, 2023. Generating innovative and creative ideas is a key component of success in many fields, from business and marketing to science, technology, and the arts. However, the process of coming up with new and unique ideas can be challenging, especially when faced with deadlines, limited resources, or creative blocks. This is where brainstorming, or mindstorming, comes into the picture. Fortunately, there are several different types of brainstorming techniques that can help individuals and teams generate great ideas and innovate. While brainstorming is one of the most effective techniques out there, not all brainstorming sessions are created equal. The question that is frequently asked is how to brainstorm for effective …

Posted in News.

Python – How to Create Scatter Plot with IRIS Dataset


Last updated: 1st Dec, 2023. In this blog post, we will learn how to create a scatter plot with the IRIS dataset using Python. The IRIS dataset is a collection of data that is often used to demonstrate the properties of various statistical models. It contains 150 observations (50 for each of three iris species) on four different variables: Petal Length, Petal Width, Sepal Length, and Sepal Width. As data scientists, it is important for us to be able to visualize the data that we are working with. Scatter plots are a great way to do this because they show the relationship between two variables. In this post, we learn how to plot IRIS dataset …

Posted in Data Science, Python.

F-statistics in Linear Regression: Formula, Examples

linear regression R-squared concepts

Last updated: 1st Dec, 2023. In this blog post, we will take a look at the concepts and formula of F-statistics in linear regression models and understand how to interpret F-statistics in regression with the help of examples. Interpreting the F-test and the related F-statistic is key if you want to evaluate regression models based on the summary results of training the model. We will start by discussing the importance of F-statistics in linear regression models and understand how they are calculated based on the F-statistic formula. We will then work through the concept with some real-world examples. As data scientists, it is very important to understand both the f-statistics …
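The excerpt does not show the formula itself, so the sketch below uses the standard overall F-statistic for a regression with an intercept, F = (R² / k) / ((1 − R²) / (n − k − 1)), where n is the number of observations and k the number of predictors. The function name and the sample numbers are assumptions for illustration.

```python
def overall_f_statistic(r_squared, n_observations, n_predictors):
    """Overall F-statistic of a linear regression with an intercept.

    Tests the null hypothesis that all slope coefficients are zero.
    """
    explained = r_squared / n_predictors
    unexplained = (1.0 - r_squared) / (n_observations - n_predictors - 1)
    return explained / unexplained


# Hypothetical model summary: R-squared 0.5, 12 observations, 1 predictor.
print(overall_f_statistic(0.5, 12, 1))
```

A large F value relative to the F distribution with (k, n − k − 1) degrees of freedom suggests that at least one predictor carries explanatory power.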

Posted in Data Science, Machine Learning, statistics.

Python – Replace Missing Values with Mean, Median & Mode

Boxplot for deciding whether to use mean, mode or median for imputation

Last updated: 1st Dec, 2023. Have you found yourself asking questions such as how to deal with missing values in the data analysis stage? When working with Python, have you been troubled by questions such as how to replace missing values in a Pandas data frame? Well, missing values are common in real-world problems when the data is aggregated over long time stretches from disparate sources, and reliable machine learning modeling demands careful handling of missing data. One strategy is imputing the missing values, and a wide variety of algorithms exist spanning simple interpolation (mean, median, mode), matrix factorization methods like SVD, statistical models like Kalman filters, and deep …
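The simplest of the strategies listed above (mean, median, mode) can be sketched without any third-party library. The full post works with Pandas; this version uses plain lists with None marking a missing value, and the function name and sample column are assumptions.

```python
import statistics


def impute(values, strategy="mean"):
    """Replace None entries with the mean, median or mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill_functions = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
    }
    fill_value = fill_functions[strategy](observed)
    return [fill_value if v is None else v for v in values]


# Hypothetical column with one missing value and one outlier (100).
column = [1, None, 3, 3, 100]

print(impute(column, "mean"))
print(impute(column, "median"))
print(impute(column, "mode"))
```

The outlier pulls the mean far above the typical value, which is one reason the median is often preferred for skewed numeric columns, while the mode suits categorical ones.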

Posted in Data Science, Machine Learning, Python.

Accuracy, Precision, Recall & F1-Score – Python Examples

Last updated: 30th Nov, 2023. Classification models are used in classification problems to predict the target class of a data sample. Classification machine learning models predict the probability that each instance belongs to one class or another. It is important to evaluate the performance of classification models in order to reliably use them in production for solving real-world problems. The performance metrics include accuracy, precision, recall, and F1-score. Evaluating model performance is essential in machine learning because it helps us understand the strengths and limitations of these models when making predictions in new situations. The most common question asked is what is accuracy, precision, recall and f1 score? In …
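The four metrics named above can be computed directly from the counts of true/false positives and negatives. The sketch below does this for binary labels (1 = positive class) in plain Python; the function name and the sample labels are assumptions, and the full post may rely on a library instead.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1-score for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

print(classification_metrics(y_true, y_pred))
```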

Posted in Data Science, Machine Learning, Python.

Chebyshev’s Theorem: Formula & Examples

chebyshev theorem for standard deviation

Chebyshev’s theorem is a fundamental concept in statistics that gives a lower bound on the proportion of data values falling within a certain range defined by the mean and standard deviation. For any dataset or distribution with a finite mean and variance, it bounds the proportion of values that lie within k standard deviations of the mean, for k greater than 1. It is important for data scientists, statisticians, and analysts to understand this theorem as it can be used to assess the spread of data points around a mean value. What is Chebyshev’s Theorem? Chebyshev’s Theorem, also known as Chebyshev’s Rule, states that in any probability distribution, the proportion of outcomes that lie within k standard deviations from the mean …
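The bound given by Chebyshev’s theorem is 1 − 1/k² for k > 1, and it holds regardless of the shape of the distribution. The short sketch below evaluates that bound; the function name is an assumption chosen for illustration.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1.0 - 1.0 / k**2


for k in (1.5, 2, 3):
    print(f"k={k}: at least {chebyshev_lower_bound(k):.1%} of values")
```

For k = 2 the bound is 75%, noticeably weaker than the roughly 95% that holds for a normal distribution, because Chebyshev’s theorem makes no assumption about the distribution’s shape.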

Posted in Data Science, statistics.

Independent Samples T-test: Formula & Examples

independent samples t-test

Last updated: 30th Nov, 2023 As a data scientist, you may often come across scenarios where you need to compare the means of two independent samples. In such cases, a two independent samples t-test, also known as unpaired two samples t-test, is an essential statistical tool that can help you draw meaningful conclusions from your data. This test allows you to determine whether the difference between the means of two independent samples is statistically significant or due to chance. In this blog, we will cover the concept of independent samples t-test, its formula, real-world examples of its applications and the Python & Excel example (using scipy.stats.ttest_ind function). We will begin …
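As a sketch of what the test computes, the code below calculates the pooled-variance (equal-variance) t-statistic for two independent samples using only the standard library. The excerpt mentions scipy.stats.ttest_ind for the same purpose; the function name and sample values here are assumptions, and the p-value lookup is left out.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Two independent samples t-statistic, assuming equal population variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)

    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    degrees_of_freedom = n_a + n_b - 2
    return (mean_a - mean_b) / standard_error, degrees_of_freedom


# Hypothetical measurements from two independent groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]

t_stat, dof = pooled_t_statistic(group_a, group_b)
print(t_stat, dof)
```

The resulting statistic is compared against the t distribution with the returned degrees of freedom to decide whether the difference in means is statistically significant.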

Posted in Data Science, statistics.

AIC in Logistic Regression: Formula, Example

Model evaluation using AIC in Logistic Regression

Have you, as a data scientist, ever been challenged by choosing the best logistic regression model for your data? As we all know, the difference between a good model and the best model when training machine learning models can be subtle yet impactful. Whether it’s predicting the likelihood of an event occurring or classifying data into distinct categories, logistic regression provides a robust framework for analysts and researchers. However, the true power of logistic regression is harnessed not just by building models, but also by selecting the right model. This is where the Akaike Information Criterion (AIC) comes into play. In this blog, we’ll delve into different aspects of AIC, decode …
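For reference, AIC is defined as 2k − 2 ln(L), where k is the number of estimated parameters and ln(L) is the maximized log-likelihood of the model; the candidate with the lowest AIC is preferred. The sketch below applies that definition to two made-up models; the function name, the log-likelihood values and the parameter counts are hypothetical.

```python
def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: 2k - 2 * ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical fitted logistic regression models: (log-likelihood, parameter count).
candidates = {
    "small_model": (-100.0, 3),
    "large_model": (-98.5, 6),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best_model = min(scores, key=scores.get)

print(scores)
print("preferred:", best_model)
```

Here the larger model fits slightly better but not enough to offset the penalty for its three extra parameters, so the smaller model has the lower AIC.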

Posted in Data Science, Machine Learning, Python, R.

Linear Regression T-test: Formula, Example

Linear regression line slope 0

Last updated: 29th Nov, 2023. Linear regression is a popular statistical method used to model the relationship between a dependent variable and one or more independent variables. In linear regression, the t-test is a statistical hypothesis testing technique used to test hypotheses about the linear relationship between the response variable and the different predictor variables. In this blog, we will discuss linear regression and the t-test, along with related formulas and examples. For a detailed read on linear regression, check out my related blog – Linear regression explained with real-life examples. T-tests are used in linear regression to determine if a particular variable is statistically significant in the …
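For a simple linear regression with one predictor, the t-statistic for the slope is the estimated slope divided by its standard error, under the null hypothesis that the true slope is 0. The sketch below computes it from scratch; the function name and the toy data are assumptions, and the excerpt does not show the post's own formula.

```python
import math


def slope_t_statistic(xs, ys):
    """Least-squares slope and its t-statistic for the null hypothesis slope = 0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x

    # Residual variance uses n - 2 degrees of freedom (slope and intercept estimated).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    standard_error = math.sqrt((sse / (n - 2)) / s_xx)
    return slope, slope / standard_error


# Hypothetical predictor and response values.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, t_stat = slope_t_statistic(xs, ys)
print(slope, t_stat)
```

The statistic is then compared with the t distribution with n − 2 degrees of freedom to judge whether the predictor is statistically significant.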

Posted in Data Science, Python, R, statistics.

Types of SQL Joins: Differences, SQL Code Examples

SQL Joins explained using Sets

Structured Query Language (SQL) is one of the most important and widely used tools for data manipulation. It allows users to interact with databases, query and manipulate data, and create reports. One of SQL’s most important features is its ability to join tables together in order to enrich, compare and analyze related data. The main join types are inner join, outer join, left join and right join. In this article, we will discuss the different types of joins available in SQL, their differences and provide examples of how each can be used. What is SQL Join? SQL Joins are a technique used in Structured Query Language (SQL) to combine two …
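To show how two of these join types differ in practice, the sketch below runs an INNER JOIN and a LEFT JOIN against an in-memory SQLite database from Python. The table names, columns and rows are hypothetical, and only these two join types are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
    """
)

# INNER JOIN keeps only customers that have at least one matching order.
inner_rows = conn.execute(
    """
    SELECT c.name, o.amount
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY o.id
    """
).fetchall()

# LEFT JOIN keeps every customer; missing order columns come back as NULL (None).
left_rows = conn.execute(
    """
    SELECT c.name, o.amount
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.id
    """
).fetchall()

print(inner_rows)
print(left_rows)
conn.close()
```

The customer without orders appears only in the LEFT JOIN result, with None in place of the order amount.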

Posted in Data, Data analytics, Database.