Fixed vs Random vs Mixed Effects Models – Examples

fixed and random effects models

Have you ever wondered what fixed effects, random effects, and mixed effects models are? Or, more importantly, how they differ from one another? In this post, you will learn about the concepts of fixed and random effects models, along with when to use fixed effects models and when to go for fixed + random effects (mixed) models. The concepts will be explained with examples. As a data scientist, you need a good understanding of these concepts, as it will help you build better linear models such as general linear mixed models or generalized linear mixed models (GLMM). What are fixed, random & mixed effects models? First, we will take a real-world example and try to understand …

Posted in Data Science, statistics.

CNN Basic Architecture for Classification & Segmentation

image classification object detection image segmentation

As data scientists, we are constantly exploring new techniques and algorithms to improve the accuracy and efficiency of our models. When it comes to image-related problems, convolutional neural networks (CNNs) are an essential tool in our arsenal. CNNs have proven to be highly effective for tasks such as image classification and segmentation, and have even been used in cutting-edge applications such as self-driving cars and medical imaging. CNNs are deep neural networks that can classify and segment images. They can be trained using supervised or unsupervised machine learning methods, depending on what you want them to do. CNN architectures for classification and segmentation include …

Posted in Data Science, Deep Learning, Machine Learning.

Python – Replace Missing Values with Mean, Median & Mode

Boxplot for deciding whether to use mean, mode or median for imputation

Missing values are common in real-world problems where the data is aggregated over long stretches of time from disparate sources, and reliable machine learning modeling demands careful handling of missing data. One strategy is imputing the missing values, and a wide variety of algorithms exist, spanning simple statistics (mean, median, mode), matrix factorization methods like SVD, statistical models like Kalman filters, and deep learning methods. Missing value imputation, or replacement, techniques help machine learning models learn from incomplete data. There are three main simple imputation techniques – mean, median and mode. Mean is the average of all values in a set, median is the middle number in …
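To make the three simple strategies concrete, here is a minimal sketch using only Python's standard `statistics` module. The helper name `impute`, the `None` marker for missing entries, and the `ages` data are all illustrative assumptions, not code from the post.

```python
from statistics import mean, median, mode

def impute(values, strategy="mean"):
    """Return a copy of `values` with None replaced by the chosen statistic.

    Hypothetical helper for illustration; missing entries are assumed
    to be marked with None.
    """
    observed = [v for v in values if v is not None]
    # Compute the fill value from the observed (non-missing) entries only.
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]

# Toy data with two missing entries and one large outlier (90).
ages = [22, 25, None, 25, 90, None]

imputed_mean = impute(ages, "mean")      # fills with 40.5, pulled up by the outlier
imputed_median = impute(ages, "median")  # fills with 25, robust to the outlier
imputed_mode = impute(ages, "mode")      # fills with 25, the most frequent value
```

Because the mean is pulled toward the outlier while the median is not, comparing the two fill values on your own data is a quick way to see which statistic is the safer choice.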

Posted in Data Science, Machine Learning, Python.

Histogram and Density Plots in Python & R

histogram with different bin widths

In the world of data science, visualizing data is crucial for making sense of the information at hand. One of the most popular ways to visualize data is by using histograms and density plots. These visualizations help us understand the distribution of data and identify patterns that may not be apparent from raw numbers alone. In this blog, we will explore how to create histograms and density plots in two popular programming languages, Python and R. As a data scientist, it is important to have a good understanding of these visualizations because they allow you to communicate your findings effectively. Histograms and density plots can help you see the …

Posted in Data Science, Python, R.

Feature Selection vs Feature Extraction: Machine Learning

Feature extraction vs feature selection

Machine learning has become an increasingly important tool for businesses and researchers alike in recent years. From identifying patterns in data to making predictions about future outcomes, machine learning algorithms are now being used in a wide variety of fields. However, the success of these algorithms often depends on the quality of the features used to train them. This is where the concepts of feature selection and feature extraction come in. In this blog post, we’ll explore the difference between feature selection and feature extraction, two key techniques used in machine learning to optimize feature sets for better model performance. Both feature selection and feature extraction are used for dimensionality …

Posted in Data Science, Machine Learning.

Keras: Multilayer Perceptron (MLP) Example


Artificial Neural Networks (ANNs) have emerged as a powerful tool in machine learning, and the Multilayer Perceptron (MLP) is a popular type of ANN that is widely used in domains such as image recognition, natural language processing, and predictive analytics. Keras is a high-level API that makes it easy to build and train neural networks, including MLPs. In this blog, we will build a simple MLP model using Keras, train it on a dataset, and explain the different aspects of training an MLP model with Keras. By the end of …

Posted in Deep Learning, Machine Learning.

Neural Network & Multi-layer Perceptron Examples

Single layer neural network

Neural networks are an important part of machine learning, so it is essential to understand how they work. A neural network is a computing system modeled on a biological neural network, which comprises neurons connected with each other. It can be built to solve machine learning tasks, like classification and regression problems. The perceptron algorithm is a simple representation of how neural networks work. The perceptron was first proposed by Frank Rosenblatt in 1957 as a model of the human brain’s perception mechanism. This post will explain the basics of neural networks with a perceptron example. You will understand how a neural network is built using perceptrons. This …
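As a rough illustration of the perceptron idea, the sketch below trains a single perceptron with the classic error-driven update rule on the logical AND function. The function names, the learning rate, the epoch count, and the AND dataset are assumptions chosen for this example, not details taken from the post.

```python
def train_perceptron(samples, labels, epochs=10, lr=1.0):
    """Train one perceptron with a step activation (illustrative sketch)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            error = target - predicted  # 0 when correct, +1 or -1 otherwise
            # Nudge weights and bias toward the correct output.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def predict(weights, bias, x):
    """Apply the learned step function to a single input."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Logical AND is linearly separable, so a single perceptron can learn it.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]

weights, bias = train_perceptron(inputs, and_labels)
predictions = [predict(weights, bias, x) for x in inputs]
```

A single perceptron can only separate classes with a straight line (or hyperplane), which is why problems like XOR need several perceptrons stacked into layers.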

Posted in Data Science, Deep Learning, Machine Learning.

K-Fold Cross Validation – Python Example

K-Fold Cross Validation Concepts with Python and Sklearn Code Example

In this post, you will learn about K-fold cross-validation concepts with Python code examples. K-fold cross-validation is a data splitting technique that divides the data into k > 1 folds. It is also known as k-cross, k-fold CV, and k-folds. The technique can be implemented easily in Python using the scikit-learn (Sklearn) package, which provides an easy way to run k-fold cross-validation. It is important to learn the concepts of cross-validation in order to perform model tuning, with the end goal of choosing a model that has high generalization performance. As a data scientist / machine learning engineer, you must have a good …
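To show what the splitting step looks like without any library, here is a minimal pure-Python sketch. The helper `k_fold_indices` is hypothetical and only mimics the general idea (each fold serves once as the held-out set); it does not shuffle, and it is not scikit-learn's implementation.

```python
def k_fold_indices(n_samples, k):
    """Yield-free sketch: return a list of (train_indices, test_indices) pairs."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    splits = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]                 # held-out fold
        train_idx = indices[:start] + indices[start + size:]   # everything else
        splits.append((train_idx, test_idx))
        start += size
    return splits

# 10 samples split into 3 folds of sizes 4, 3 and 3.
splits = k_fold_indices(10, 3)
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

In practice you would train a model on each `train_idx` subset, score it on the matching `test_idx` subset, and average the k scores to estimate generalization performance.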

Posted in Data Science, Machine Learning, Python.

Positively Skewed Probability Distributions: Examples

positively skewed distribution example

Probability distributions are an essential concept in statistics and data analysis. They describe the likelihood of different outcomes or events occurring and provide valuable insights into the characteristics of a given data set. Skewness is an important aspect of probability distributions that can have a significant impact on data analysis and decision-making. In this blog, we will focus on positively skewed probability distributions and explore some real-life examples where these distributions occur. We will discuss what a positively skewed distribution is and what its different types are, with formulas and definitions. By the end of this blog, you will have a better understanding of positively skewed distributions and be able to …
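One common way to check for positive skew numerically is the moment-based skewness coefficient, the average cubed z-score. The sketch below assumes that definition (other skewness formulas exist), and the `incomes` values are made-up illustrative data with a long right tail.

```python
from statistics import mean, median, pstdev

def skewness(data):
    """Population (moment-based) skewness: the mean of cubed z-scores."""
    mu = mean(data)
    sigma = pstdev(data)  # population standard deviation; assumes sigma > 0
    return sum(((x - mu) / sigma) ** 3 for x in data) / len(data)

# Made-up incomes: most values are small, one is very large (right tail).
incomes = [20, 22, 25, 27, 30, 35, 40, 120]

income_skew = skewness(incomes)          # positive for a right-skewed sample
mean_above_median = mean(incomes) > median(incomes)
symmetric_skew = skewness([1, 2, 3, 4, 5])  # zero for a symmetric sample
```

For a sample like this, the mean sitting above the median is the typical signature of a long right tail.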

Posted in Data Science, statistics.

Maximum Likelihood Estimation: Concepts, Examples

maximum likelihood estimation likelihood function

As data science continues to grow in importance and relevance, so too does the need for tools and techniques that can help extract insights from large, complex datasets. One such tool that is becoming increasingly popular among data scientists is Maximum Likelihood Estimation (MLE). Learning the fundamentals of MLE is becoming even more important because it is at the core of generative modeling (generative AI). MLE is a statistical method used to estimate the parameters of a probability distribution, based on a set of observed data points. MLE is particularly important for data scientists because it underpins many of the probabilistic machine learning models that are used today. …
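A small worked example may help here. The sketch below assumes the data are independent coin flips (a Bernoulli distribution with unknown success probability p), which is an illustrative choice rather than the post's own example. It finds the p that maximizes the log-likelihood by grid search and compares it with the known closed-form answer, the sample proportion.

```python
import math

def bernoulli_log_likelihood(p, observations):
    """Log-likelihood of 0/1 observations under success probability p."""
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in observations)

# Illustrative data: 7 successes out of 10 trials.
observations = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

# Grid search over candidate values of p in (0, 1).
grid = [i / 100 for i in range(1, 100)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, observations))

# Closed-form MLE for a Bernoulli parameter: the sample proportion.
p_closed_form = sum(observations) / len(observations)

print(p_grid, p_closed_form)
```

Both approaches agree on this data, which is the point of MLE: the estimate is whichever parameter value makes the observed data most probable.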

Posted in Data Science, Machine Learning.

Generative Modeling in Machine Learning: Examples

generative modeling using RNN

Machine learning has rapidly evolved over the past few years, with new techniques and methods emerging regularly. One of the most exciting and promising areas in this field is generative modeling. Generative modeling refers to the creation of new data samples that are similar to existing data sets. This technique has gained immense popularity in recent times due to its ability to generate highly realistic images, videos, and music. As a data scientist, it is crucial to understand generative modeling and its various applications. This powerful tool has been used in a wide range of fields, including computer vision, natural language processing (NLP), and even drug discovery. By learning generative …

Posted in Data Science, Deep Learning, Machine Learning.

Data Analytics Training Program (Beginners)

Data analytics training

Data analytics has become an integral part of businesses today, helping organizations make data-driven decisions that drive success. To become proficient in data analytics and solve complex business problems, it is essential to have a strong foundation in the key concepts and tools of data analytics. My online courses, which cover topics such as data-driven decision making / decision science, business statistics, Python programming, machine learning, and business analytics, are designed to help learners of all levels become experts in these areas. Check out this page for detailed information: Become Data Analytics Pro! Each of these courses is designed to help learners acquire the skills and knowledge necessary to succeed …

Posted in Career Planning, Online Courses.

Histogram Plots using Matplotlib & Pandas: Python

Side by side histogram plots using Matplotlib and Pandas library in Python

Histograms are a graphical representation of the distribution of data. In Python, there are several ways to create histograms. One popular method is to use the Matplotlib library. In this tutorial, we will show you how to create different types of histogram plots in Python using Matplotlib. As data scientists, it is important to learn how to create visualizations to communicate our findings. Histograms are one way to do this effectively. What are Histogram plots? Histogram plots are a way of representing the distribution of data. A histogram is made up of bars, with each bar representing a certain range of data values. The height of the bar indicates how many …
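The binning behind a histogram can be sketched in a few lines of plain Python. The helper `histogram_counts` below is hypothetical and uses equal-width bins; plotting libraries such as Matplotlib do this binning for you, so this is only meant to show what the bar heights represent.

```python
def histogram_counts(data, bins):
    """Return (bin_edges, counts) for equal-width bins (illustrative sketch).

    Assumes the data contain at least two distinct values.
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # The maximum value falls on the last edge, so clamp it into the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts

values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
edges, counts = histogram_counts(values, 4)

# Print a text version of the histogram: one row per bin.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"[{left:.0f}, {right:.0f}): {'#' * count}")
```

Changing the `bins` argument shows how the same data can look smoother or spikier depending on bin width.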

Posted in Data, Data Science, statistics.

Generative vs Discriminative Models: Examples

generative vs discriminative models

The field of machine learning is rapidly evolving, and with it, the concepts and techniques that are used to develop models that can learn from data. Among these concepts, generative and discriminative models are two widely used approaches. Generative models learn the joint probability distribution of the input features and output labels, whereas discriminative models learn the conditional probability distribution of the output labels given the input features. While both kinds of model have their strengths and weaknesses, understanding the differences between them is crucial to developing effective machine learning systems. Real-world problems such as speech recognition, natural language processing, and computer vision require complex solutions that are able …
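The joint-versus-conditional distinction can be illustrated with simple counting on a toy dataset. Everything in the sketch below (the weather/decision data and the variable names) is an assumption made for illustration; real generative and discriminative models estimate these distributions with far richer machinery.

```python
from collections import Counter

# Toy (feature, label) pairs: weather and whether a game was played.
samples = [
    ("sunny", "yes"), ("sunny", "yes"), ("sunny", "no"),
    ("rainy", "no"), ("rainy", "no"), ("rainy", "yes"),
]

pair_counts = Counter(samples)
total = len(samples)

# Generative view: estimate the joint distribution P(x, y).
joint = {pair: count / total for pair, count in pair_counts.items()}

# Discriminative view: estimate the conditional distribution P(y | x) directly.
feature_counts = Counter(x for x, _ in samples)
conditional = {
    (x, y): count / feature_counts[x] for (x, y), count in pair_counts.items()
}

def conditional_from_joint(x, y):
    """Recover P(y | x) from the joint via P(x, y) / P(x)."""
    marginal = sum(p for (xi, _), p in joint.items() if xi == x)
    return joint[(x, y)] / marginal
```

The joint distribution carries enough information to recover the conditional one, as `conditional_from_joint` shows, but not the other way around, since the conditional says nothing about how likely each input is.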

Posted in Data Science, Machine Learning.

NLP Pre-trained Models: Concepts, Examples

NLP pretrained models

Natural Language Processing (NLP) is a branch of AI whose goal is to make machines capable of understanding and producing human language. NLP has been around for decades, but it has recently seen an explosion in popularity due to pre-trained models (PTMs), which NLP developers can put to use with minimal effort and time. This blog post will introduce you to different types of pre-trained machine learning models for NLP and discuss their usage in real-world examples. Before we look at the different types of pre-trained models for NLP, let’s understand the concepts related to them. What are pre-trained models for NLP? …

Posted in Deep Learning, NLP.

Accuracy, Precision, Recall & F1-Score – Python Examples

Classification models are used in classification problems to predict the target class of a data sample. A classification model predicts the probability that each instance belongs to one class or another. It is important to evaluate the performance of a classification model before relying on it in production to solve real-world problems. Performance metrics for classification models, including accuracy, precision, recall, and F1-score, assess how well a model performs in a given context. Evaluating model performance is essential because it helps us understand the strengths and limitations of these models when making predictions in new situations. …
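The four metrics can be computed directly from the counts of true/false positives and negatives. The sketch below is a plain-Python illustration for the binary case; the helper name `classification_metrics` and the label vectors are assumptions made for this example, and undefined ratios (zero denominators) are reported as 0.0 by convention here.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual positives, how many were found
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Illustrative labels: 3 true positives, 2 false positives, 1 false negative, 2 true negatives.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On this data precision (0.6) is lower than recall (0.75) because the model produces more false positives than false negatives, and the F1-score sits between the two as their harmonic mean.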

Posted in Data Science, Machine Learning, Python.