Author Archives: Ajitesh Kumar

Ajitesh Kumar

I have recently been working in the area of data analytics, including data science and machine learning / deep learning. I am also passionate about different technologies, including programming languages such as Java/JEE, JavaScript, Python, R, and Julia, and technologies such as blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, and big data. I would love to connect with you on LinkedIn. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking.

Bagging Classifier Python Code Example

November 25, 2023 by Ajitesh Kumar · Leave a comment

Last updated: 25th Nov, 2023 Bagging is an ensemble machine learning approach that combines the outputs of many learners to improve performance. The bagging algorithm works by drawing random subsets of the training set, typically bootstrap samples drawn with replacement. A separate machine learning model is then trained on each subset. The predictions from these models are combined, for example by majority vote in classification or by averaging in regression, to generate an overall prediction for each instance in the original data. In this blog post, you will learn about the concept of bagging along with a Bagging Classifier Python code example. Bagging can be used in machine learning for both classification and regression problems. The bagging classifier technique is utilized across a …
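The idea described above can be sketched in plain Python. This is a minimal illustration, not the article's own code: the decision-stump base learner, the function names, and the toy one-feature dataset are all assumptions chosen to keep the example self-contained. In practice a library implementation such as scikit-learn's BaggingClassifier would normally be used.

```python
import random
from collections import Counter

def train_stump(X, y):
    """Fit a one-feature threshold classifier (decision stump) by brute force."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for left_label, right_label in ((0, 1), (1, 0)):
                # Count training errors for this (feature, threshold, labels) choice.
                errors = sum(
                    (left_label if row[feature] <= threshold else right_label) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, left_label, right_label)
    return best[1:]

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def bagging_fit(X, y, n_estimators=15, seed=0):
    """Train one stump per bootstrap sample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, row):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(model, row) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: one feature, two classes.
X = [[1.0], [2.0], [3.0], [4.0], [6.0], [7.0], [8.0], [9.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(X, y)
print(bagging_predict(models, [0.0]), bagging_predict(models, [10.0]))
```

Each stump sees a slightly different sample, so individual stumps may disagree near the class boundary; the vote smooths out those differences.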

Posted in Data Science, Machine Learning, Python. Tagged with Data Science, machine learning, python, sklearn.

Activation Functions in Neural Networks: Concepts, Examples

November 24, 2023 by Ajitesh Kumar · 1 Comment

Last updated: 24th Nov, 2023 Activation functions are critical to understanding neural networks. Data scientists have many activation functions to choose from when training neural networks, so it can be difficult to decide which one will work best for a given need. In this blog post, we look at different activation functions and provide examples of when they should be used in different types of neural networks. If you are starting out in deep learning and want to know about the different types of activation functions, you may want to bookmark this page for quicker access in the future. What are activation functions in neural networks? In a …
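As a rough illustration of what an activation function computes, here is a small Python sketch. The excerpt does not say which functions the article covers, so the four shown here (sigmoid, tanh, ReLU, softmax) are an assumption based on common usage.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squash any real number into the range (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Pass positive values through unchanged and clip negatives to zero."""
    return max(0.0, x)

def softmax(values):
    """Turn a list of scores into probabilities that sum to 1."""
    peak = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), tanh(x), relu(x))
print(softmax([1.0, 2.0, 3.0]))
```

Sigmoid and softmax are typically placed on output layers for binary and multi-class classification respectively, while ReLU is a common default for hidden layers.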

Posted in Deep Learning, Machine Learning. Tagged with Data Science, Deep Learning, machine learning.

PCA Explained Variance Concepts with Python Example

November 24, 2023 by Ajitesh Kumar · Leave a comment

Last updated: 24th Nov, 2023 Dimensionality reduction is an important technique in data analysis and machine learning that allows us to reduce the number of variables in a dataset while retaining the most important information. By reducing the number of variables, we can simplify the problem, improve computational efficiency, and avoid overfitting. Principal Component Analysis (PCA) is a popular dimensionality reduction technique that aims to transform a high-dimensional dataset into a lower-dimensional space while retaining most of the information. PCA works by identifying the directions that capture the most variation in the data and projecting the data onto those directions, which are called principal components. However, when we apply PCA, …
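To make the idea of explained variance concrete, the following sketch computes it for two-dimensional data using only the standard library. It assumes the usual definition: the explained variance ratio of a principal component is its eigenvalue of the covariance matrix divided by the sum of all eigenvalues. The function name and the sample points are hypothetical, and the closed-form eigenvalue formula used here only works for a 2x2 covariance matrix.

```python
import math

def explained_variance_ratio_2d(points):
    """Return the explained variance ratios of the two principal components."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample covariance matrix entries (divide by n - 1).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]] in closed form.
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    eigenvalues = [half_trace + spread, half_trace - spread]
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]

# Hypothetical data lying close to a straight line.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
print(explained_variance_ratio_2d(points))
```

When the points lie almost on a line, the first ratio is close to 1, meaning a single component retains nearly all of the variation.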

Posted in Data Science, Machine Learning, Python. Tagged with Data Science, machine learning, python.

R-squared & Adjusted R-squared: Differences, Examples

November 23, 2023 by Ajitesh Kumar · 1 Comment

R-squared and adjusted R-squared are two measures of how well a linear regression model fits the data. While they are both important, they measure different aspects of model fit. In this blog post, we will discuss the differences between R-squared and adjusted R-squared, and provide some examples to help illustrate their meanings. As a data scientist, it is important to understand these differences in order to select the most appropriate model from a set of candidate linear regression models. What is R-squared? R-squared, also known as the coefficient of determination, is a measure of what proportion of the variance in the value of the …
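The two measures can be computed in a few lines. This sketch uses the standard textbook formulas, R-squared = 1 - SS_res / SS_tot and adjusted R-squared = 1 - (1 - R-squared)(n - 1) / (n - p - 1), where n is the number of observations and p the number of predictors. The function names and the sample values are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """Proportion of variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Hypothetical observed values and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.8, 5.1, 7.2, 8.9, 11.0]

r2 = r_squared(y_true, y_pred)
print(r2, adjusted_r_squared(r2, n_samples=len(y_true), n_predictors=1))
```

Adding a predictor never lowers R-squared, but it lowers adjusted R-squared unless the new predictor improves the fit enough to offset the penalty, which is why the adjusted value is the safer choice when comparing models of different sizes.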

Posted in Data Science, Machine Learning. Tagged with Data Science, machine learning.

Feature Scaling in Machine Learning: Python Examples

November 23, 2023 by Ajitesh Kumar · Leave a comment

While training machine learning models, we often need to scale features so that different features contribute to the predictions in an appropriate manner. Without scaling, features with larger numerical ranges can dominate those with smaller ranges, leading to biased or inefficient learning. In this post, you will learn about feature scaling, a feature engineering technique that can significantly improve the performance of machine learning models, with Python code examples. To demonstrate the technique, the models will be trained using a Perceptron (single-layer neural network) classifier. What is Feature Scaling? Why is it needed? Feature scaling is a method used to standardize the range of independent variables …
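Two common ways to scale a feature are standardization (zero mean, unit variance) and min-max scaling (mapping values to the range 0 to 1). The sketch below shows both in plain Python. The function names and the sample incomes are hypothetical, and the excerpt does not say which of the two methods the article itself uses.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical feature with a large numerical range.
incomes = [25000.0, 40000.0, 55000.0, 70000.0, 120000.0]
print(standardize(incomes))
print(min_max_scale(incomes))
```

Both functions assume the feature has at least two distinct values; a constant feature would cause a division by zero and should be handled separately.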

Posted in AI, Data Science, Machine Learning. Tagged with Data Science, machine learning, python.

Different Types of Statistical Tests: Concepts

November 18, 2023 by Ajitesh Kumar · 5 Comments

Last updated: 18th Nov, 2023 Statistical tests are an important part of data analysis. They help us understand the data and make inferences about the population. They are used to examine relationships between variables based on hypothesis testing, for example to see whether there is a significant difference between two groups, or between a group and a population. In statistics, there are two main types of tests: parametric and non-parametric. Both types of tests are used to make inferences about a population based on a sample. The difference between the two types lies in the assumptions that they make about the data. Parametric tests …

Posted in Data Science, statistics. Tagged with Data Science, statistics.

Machine Learning – Sensitivity vs Specificity Differences, Examples

November 18, 2023 by Ajitesh Kumar · 3 Comments

Last updated: 18th Nov, 2023 Machine learning (ML) models are increasingly being used to learn from data and make decisions or predictions based on that learning. When it comes to evaluating the performance of these ML models, there are several important metrics to consider. Among the most important are sensitivity and specificity, which describe how accurately a model identifies positive and negative cases respectively. They are often used in the context of classification tasks in machine learning to evaluate the performance of a classification model. In this post, we will try to understand the concepts behind machine learning model evaluation metrics such as …
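As a quick illustration, sensitivity (the true positive rate) and specificity (the true negative rate) can be computed directly from actual and predicted labels. The function name and the labels below are invented for the example.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary classification result."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives that were rejected
    return sensitivity, specificity

# Hypothetical labels: 4 actual positives and 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(sensitivity_specificity(y_true, y_pred))
```

Here 3 of the 4 positives are detected (sensitivity 0.75) and 4 of the 6 negatives are correctly rejected (specificity of about 0.67), even though overall accuracy alone (7 of 10) would hide that imbalance.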

Posted in Data Science, Machine Learning. Tagged with Data Science, machine learning.

PCA vs LDA Differences, Plots, Examples

November 18, 2023 by Ajitesh Kumar · Leave a comment

Last updated: 18th Nov, 2023 Dimensionality reduction is an important technique in data analysis and machine learning that allows us to reduce the number of variables in a dataset while retaining the most important information. By reducing the number of variables, we can simplify the problem, improve computational efficiency, and avoid overfitting. Two popular dimensionality reduction techniques are Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Both techniques aim to reduce the dimensionality of the dataset, but they differ in their objectives, assumptions, and outputs. How exactly do they differ, and when should you use one method over the other? As data scientists, it is important to get a …

Posted in Data Science, Machine Learning, Python. Tagged with Data Science, machine learning, python.

Types & Uses of Moments in Statistics

November 18, 2023 by Ajitesh Kumar · Leave a comment

Last updated: 18th Nov, 2023 In statistics, moments are measures of the shape and variability of a data set. They are used to describe the location and dispersion of the data. There are several types of moments that can be calculated, each providing different information about the data set. Let’s take a look at some of these moments, along with their definitions, formulas, and examples highlighting how they can be used in statistical analysis. What are Moments in Statistics and what are their types? In statistics, moments are an important tool used to measure the characteristics of a distribution. Moments can provide useful information about the spread, shape, and center of a …
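The first four moments can be computed with a short Python sketch. It assumes the population (divide by n) definitions: the mean, the variance as the second central moment, and skewness and kurtosis as the standardized third and fourth central moments. The function names and sample data are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment: the average of (x - mean) ** k."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** k for v in values) / len(values)

def describe_moments(values):
    """Return mean, variance, skewness, and (non-excess) kurtosis."""
    mean = sum(values) / len(values)
    variance = central_moment(values, 2)
    skewness = central_moment(values, 3) / variance ** 1.5
    kurtosis = central_moment(values, 4) / variance ** 2
    return mean, variance, skewness, kurtosis

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(describe_moments(data))
```

For this symmetric sample the skewness is zero. The kurtosis returned here is the raw (non-excess) value, so a normal distribution would score about 3 on the same scale.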

Posted in Data Science, statistics. Tagged with Data Science, statistics.

How to Add Rows to DataFrames in R Using dplyr: Examples

November 17, 2023 by Ajitesh Kumar · Leave a comment

Data manipulation is a fundamental aspect of data analysis, and R, with its dplyr package, offers an efficient and readable way to perform such tasks. In my experience working with various datasets, I have often encountered situations where I needed to add rows to an existing DataFrame. The dplyr package, part of the tidyverse collection, makes these tasks intuitive and efficient. In this blog post, I’ll share two common scenarios: adding a single row and adding multiple rows to a DataFrame using dplyr. If you want to learn how to add rows to a Pandas DataFrame using Python, check out my related post – Pandas Dataframe: How to Add …

Posted in Data Science, R. Tagged with Data Science, r programming.

Data Ingestion Types – Concepts & Examples

November 17, 2023 by Ajitesh Kumar · Leave a comment

Last updated: 17th Nov, 2023 Data ingestion is the process of moving data from its original storage location to a data warehouse or other database for analysis. Data engineers are responsible for designing and managing data ingestion pipelines. Data can be ingested in different modes, such as real-time and batch. In this blog, we will learn about the different types of data ingestion with the help of examples. What is Data Ingestion? Data ingestion is the foundational process of importing, transferring, loading, and processing data from various sources into a storage medium where it can be accessed, used, and analyzed by an organization. It’s akin to the first …

Posted in Data, data engineering. Tagged with data, data engineering, Data Ingestion.

Two samples Z-test for Means: Formula & Examples

November 16, 2023 by Ajitesh Kumar · 2 Comments

Last updated: 21st Nov, 2023 Statistical hypothesis testing is an essential tool in inferential statistics that enables researchers to make informed decisions about the population parameters based on sample statistics. One common hypothesis test for comparing two sample means is the Two-Sample Z-Test. In statistics, a two-sample z-test for means is used to determine if the means of two populations are equal. This test is used when the population standard deviations are known. As data scientists, it is of utmost importance to be able to understand and conduct this test accurately. In this blog, we will delve deeper into the Two-Sample Z-Test for means, exploring its formula, assumptions, and examples …
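For reference, the test statistic can be sketched as follows. This assumes the standard formula z = (mean1 - mean2) / sqrt(sigma1^2 / n1 + sigma2^2 / n2) with known population standard deviations and a two-sided alternative. The function name and the numbers are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2):
    """Return the z statistic and two-sided p-value for H0: the means are equal."""
    standard_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z = (mean1 - mean2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical samples: means 52 and 50, known sigma of 5 in both populations.
z, p = two_sample_z_test(mean1=52.0, mean2=50.0, sigma1=5.0, sigma2=5.0, n1=50, n2=50)
print(z, p)
```

With these inputs the standard error is 1, so z is 2 and the two-sided p-value falls just below 0.05, which would lead to rejecting the null hypothesis of equal means at the 5% significance level.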

Posted in Data Science, statistics. Tagged with Data Science, statistics.

Histogram Plots using Matplotlib & Pandas: Python

November 16, 2023 by Ajitesh Kumar · Leave a comment

Executing the above code will display the following histogram. Plotting multiple Histograms Side-by-Side using Matplotlib & Pandas When you want to understand the distribution of data with respect to different characteristics, you can plot multiple histograms side-by-side on the same plot. For example, to understand the distribution of housing prices with respect to different values of accessibility to radial highways, you would plot the histograms side-by-side on the same plot. Here is the code for plotting histograms side-by-side on the same plot: Here is what the side-by-side histogram plot would look like: Creating Stacked Histogram Plots using Matplotlib & Pandas …

Posted in Data, Data Science, statistics. Tagged with Data Science, statistics.

Confusion Matrix Concepts, Python Code Examples

November 15, 2023 by Ajitesh Kumar · 1 Comment

The confusion matrix is an essential tool in the field of machine learning and statistics for evaluating the performance of a classification model. It’s particularly useful when dealing with binary or multi-class classification problems. In this post, you will learn about the confusion matrix with examples and how it can be used to derive performance metrics for classification models in machine learning. What is a Confusion Matrix? A confusion matrix is a table used to describe the performance of a classification model on a set of test data for which the true values are known. It’s most useful when you need to know more about the accuracy of the model than just …
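A confusion matrix is straightforward to build by hand. The sketch below is a minimal illustration in which rows are actual classes and columns are predicted classes (the same orientation scikit-learn uses). The function name, the class labels, and the predictions are hypothetical.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count (actual, predicted) pairs; rows are actual, columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix

labels = ["setosa", "versicolor", "virginica"]
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "versicolor"]

matrix = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(label, row)

# Accuracy is the sum of the diagonal divided by the total number of samples.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)
print(accuracy)
```

The off-diagonal cells show which classes are confused with each other: in this made-up example, versicolor and virginica are each mistaken for the other once, while setosa is always classified correctly.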

Posted in AI, Data Science, Machine Learning. Tagged with Data Science, machine learning, python, sklearn.

Maximum Likelihood Estimation: Concepts, Examples

November 15, 2023 by Ajitesh Kumar · Leave a comment

Maximum Likelihood Estimation (MLE) is a fundamental statistical method for estimating the parameters of a statistical model that make the observed data most probable. MLE is grounded in probability theory, providing a strong theoretical basis for parameter estimation. Learning the fundamentals of MLE is becoming increasingly important because it is at the core of generative modeling (generative AI). Many models used in machine learning and statistics are based on MLE, including logistic regression, survival models, and various types of machine learning algorithms. MLE is particularly important for data scientists because it underpins many of the probabilistic machine learning models that are used today. These models, which are …
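A small example helps make the idea tangible. The sketch below estimates the success probability of a Bernoulli (coin-flip) model in two ways: with the known closed-form MLE, which is the sample proportion, and with a brute-force search over candidate values of p. The data and the grid of candidates are assumptions chosen for illustration.

```python
import math

def bernoulli_log_likelihood(p, data):
    """Log-likelihood of 0/1 observations under a Bernoulli(p) model."""
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in data)

# Hypothetical observations: 7 successes in 10 trials.
data = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

# Closed-form MLE for a Bernoulli parameter: the sample proportion.
closed_form = sum(data) / len(data)

# Numerical check: evaluate the log-likelihood on a grid and keep the best p.
grid = [i / 100 for i in range(1, 100)]
grid_estimate = max(grid, key=lambda p: bernoulli_log_likelihood(p, data))

print(closed_form, grid_estimate)
```

Both approaches agree on p = 0.7, the value under which the observed sequence is most probable. For models without a closed-form solution, a numerical optimizer plays the role of the grid search.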

Posted in Data Science, Machine Learning. Tagged with Data Science, machine learning, statistics.

Wilcoxon Signed Rank Test: Concepts, Examples

November 14, 2023 by Ajitesh Kumar · Leave a comment

How can data scientists accurately analyze data when faced with non-normal distributions or small sample sizes? This is a challenge that often arises in the dynamic field of data science, where making precise inferences is crucial. Enter the Wilcoxon Signed Rank Test—a non-parametric statistical method that stands as a powerful alternative to the traditional t-test. This blog post aims to unravel the concepts and practical applications of the Wilcoxon Signed Rank Test, offering key insights for data scientists and researchers navigating complex data landscapes. The beauty of the Wilcoxon Signed Rank Test lies in its wide applicability across numerous fields. From healthcare, where it can compare the efficacy of different …
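To show what the test actually computes, here is a sketch of the signed-rank statistic for paired samples. It follows the usual procedure: take the paired differences, drop zeros, rank the absolute differences (ties get the average rank), and sum the ranks of the positive and negative differences separately. The function name and the before/after measurements are invented, and the sketch stops at the statistic. A library routine such as SciPy's scipy.stats.wilcoxon would be needed to obtain a p-value.

```python
def wilcoxon_signed_rank(before, after):
    """Return (W+, W-, W) where W = min(W+, W-) for paired samples."""
    # Paired differences, discarding pairs with no change.
    diffs = [a - b for a, b in zip(after, before) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of the ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)

# Hypothetical paired measurements for six subjects.
before = [10, 12, 9, 11, 14, 8]
after = [12, 11, 13, 14, 14, 13]
print(wilcoxon_signed_rank(before, after))
```

A small value of W relative to the number of non-zero pairs suggests that the differences lean consistently in one direction.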

Posted in Data Science, Python, statistics. Tagged with Data Science, python, statistics.

Welcome to Vitalflux.com - your hub for AI, Machine Learning, Data Science and Data Analytics topics. Learn through detailed, real-life examples in AI/ML and Data Management. Gain practical insights and apply them to real-world scenarios!
