Category Archives: statistics

Survival Analysis Modeling for Customer Churn

Customer churn is a prevalent problem for many businesses. It can happen in several ways, such as when customers stop using the product or when they leave because of an issue with customer service. This blog post explores survival analysis modeling and how it can help you better understand customer churn. First, we will discuss survival analysis itself and why it is useful for analyzing customer behavior. Then we will show some examples of how survival analysis has been used to analyze customer churn. As data scientists, we would do well to familiarize ourselves with survival analysis, as it is a popular modeling technique …
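
To give a feel for what a survival model estimates, here is a minimal sketch of the Kaplan-Meier estimator, one common survival analysis technique (the excerpt does not say which method the post uses, so treat this as an illustration only). The function name `kaplan_meier` and the tiny dataset are made up for this example.

```python
from collections import Counter


def kaplan_meier(durations, churned):
    """Return [(t, S(t))] for every time t at which at least one churn was observed."""
    churn_counts = Counter(t for t, c in zip(durations, churned) if c)
    exit_counts = Counter(durations)  # churned or censored, both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = churn_counts.get(t, 0)
        if d:
            # Multiply by the share of at-risk customers who did NOT churn at time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


# Hypothetical data: months as a customer, and whether the customer churned (True)
# or was still active when observation ended (False, i.e. censored).
months = [2, 3, 3, 5, 6, 6, 8, 10]
churned = [True, True, False, True, False, True, True, False]

km_curve = kaplan_meier(months, churned)
for t, s in km_curve:
    print(f"month {t}: estimated probability of still being a customer = {s:.3f}")
```

Customers who had not churned by the end of the observation window still contribute to the "at risk" counts up to the point where they drop out of the data, which is the main reason survival analysis is preferred over a plain churn rate.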

Posted in Data Science, statistics.

Binomial Distribution Explained with Examples

Figure: binomial coin-tossing experiment (100 experiments, 50 trials)

The binomial distribution is a probability distribution that applies to binomial experiments. It describes the number of successes in a fixed number of trials. For example, it can be pictured as the probability distribution of the number of heads in an experiment consisting of a fixed number of coin flips. In this blog post, we will learn about the binomial distribution with the help of examples. If you are an aspiring data scientist looking to understand the binomial distribution better, this post might be very helpful.

What is a Binomial Distribution?

The binomial distribution is a discrete probability distribution that represents the probabilities of binomial random …
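
To make the coin-flip picture concrete, here is a minimal sketch that computes the binomial probability mass function with the standard library only. The helper name `binomial_pmf` and the choice of 10 fair coin flips are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_flips, p_heads = 10, 0.5  # assumed example: 10 flips of a fair coin
pmf = [binomial_pmf(k, n_flips, p_heads) for k in range(n_flips + 1)]

for k, prob in enumerate(pmf):
    print(f"P(heads = {k:2d}) = {prob:.4f}")
```

The probabilities over all possible head counts (0 through 10) add up to 1, and the most likely outcome for a fair coin is 5 heads.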

Posted in AI, Data Science, Machine Learning, statistics.

Fixed vs Random vs Mixed Effects Models – Examples

Have you ever wondered what fixed effects, random effects and mixed effects models are? Or, more importantly, how they differ from one another? In this post, you will learn about the concepts of fixed and random effects models, along with when to use fixed effects models and when to go for fixed + random effects (mixed) models. The concepts will be explained with examples. As a data scientist, you should get a good understanding of these concepts, as they will help you build better linear models such as general linear mixed models or generalized linear mixed models (GLMM).

What are fixed, random & mixed effects models?

First, we will take a real-world example and try to understand …

Posted in Data Science, statistics.

Poisson Distribution Explained with Python Examples

The Poisson distribution is a probability distribution that can be used to model the number of events in a fixed interval. It is closely related to the Poisson process (sometimes called a “random Poisson process”): when events occur independently at a constant average rate, the number of events in a fixed interval follows a Poisson distribution. The Poisson distribution describes how many times an event occurs within a given time frame, for example, how many customers visit your store or restaurant every hour. In this post, you will learn about the concepts of the Poisson probability distribution with Python examples. As a data scientist, you should get a good understanding of probability distributions including the normal, binomial and Poisson distributions.

What is Poisson distribution?

Poisson distribution is the discrete probability distribution which represents the …
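
As a quick illustration of the store-visit example, here is a minimal sketch of the Poisson probability mass function using only the standard library. The average rate of 4 customers per hour is an assumed number, and `poisson_pmf` is a helper name chosen for this sketch.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) when events occur at an average rate of lam per interval."""
    return lam**k * exp(-lam) / factorial(k)


rate_per_hour = 4  # assumed average number of customers per hour

prob_none = poisson_pmf(0, rate_per_hour)
prob_at_most_2 = sum(poisson_pmf(k, rate_per_hour) for k in range(3))

print(f"P(no customers in an hour)     = {prob_none:.4f}")
print(f"P(at most 2 customers an hour) = {prob_at_most_2:.4f}")
```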

Posted in Data Science, statistics.

Negative Binomial Distribution Python Examples

In this post, you will learn about the negative binomial distribution, explained using real-world examples and Python code. We will go over the following topics: what the negative binomial distribution is, how it differs from the binomial distribution, real-world examples of the negative binomial distribution, and a Python example.

What is Negative Binomial Distribution?

The negative binomial distribution is a discrete probability distribution of a random variable, X, which is the number of Bernoulli trials required to obtain r successes. This random variable is called a negative binomial random variable. The experiment in which X Bernoulli trials are required to produce r successes is called …
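
The definition above (X counts the trials needed to reach r successes) can be sketched with the standard library as follows. The helper name `neg_binomial_pmf` and the numbers are assumptions for illustration. Note that some libraries parameterize this distribution by the number of failures rather than the number of trials, so check the convention before comparing results.

```python
from math import comb


def neg_binomial_pmf(n_trials, r, p):
    """P(the r-th success happens exactly on trial n_trials), success probability p per trial."""
    if n_trials < r:
        return 0.0  # r successes cannot fit into fewer than r trials
    # The last trial must be a success; the other r - 1 successes fall anywhere before it.
    return comb(n_trials - 1, r - 1) * p**r * (1 - p) ** (n_trials - r)


r_successes, p_success = 3, 0.5  # assumed example values

for n in range(3, 9):
    print(f"P(X = {n}) = {neg_binomial_pmf(n, r_successes, p_success):.4f}")
```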

Posted in statistics.

Geometric Distribution Explained with Python Examples

In this post, you will learn about the geometric probability distribution with the help of real-world examples and Python code examples. It is important for data scientists to understand and build an intuition for different kinds of probability distributions, including the geometric distribution. You may want to check out some of my posts on other probability distributions: Normal distribution explained with Python examples; Binomial distribution explained with 10+ examples; Hypergeometric distribution explained with 10+ examples. This post covers the following topics: geometric probability distribution concepts, geometric distribution Python examples, and geometric distribution real-world examples.

Geometric Probability Distribution Concepts

Geometric probability distribution is a discrete …
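
Here is a minimal sketch of the geometric distribution under its usual "number of trials until the first success" convention (an alternative convention counts the failures before the first success instead). The helper names and the die-rolling example are assumptions made for illustration.

```python
def geometric_pmf(k, p):
    """P(the first success happens exactly on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


def geometric_cdf(k, p):
    """P(the first success happens on or before trial k)."""
    return 1 - (1 - p) ** k


p_six = 1 / 6  # assumed example: rolling a six with a fair die

print(f"P(first six on roll 1)    = {geometric_pmf(1, p_six):.4f}")
print(f"P(first six on roll 4)    = {geometric_pmf(4, p_six):.4f}")
print(f"P(a six within 10 rolls)  = {geometric_cdf(10, p_six):.4f}")
```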

Posted in Data Science, statistics.

Python – How to Add Trend Line to Line Chart / Graph

Figure: IPL batting average score trend lines for Chris Gayle, Rohit Sharma, Dhoni and Virat Kohli

In this post, you will learn how to add a trend line to a line chart / line graph using Python Matplotlib. As a data scientist, it is helpful to learn the concepts and related Python code for adding a trend line to a line chart, as the trend line helps you understand the trend and make decisions. In this post, we will consider an example of the IPL average batting scores of Virat Kohli, Chris Gayle, MS Dhoni and Rohit Sharma over the last 10 years, and assess the trend in their overall performance using trend lines. Let’s say that the main reason why we want to …
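
One common way to get the values of a trend line is a degree-1 least-squares fit with NumPy's `polyfit`. The sketch below computes only the trend values; the scores are made-up numbers, not real IPL statistics, and the excerpt does not say which fitting method the post uses.

```python
import numpy as np

# Made-up batting averages for 10 seasons (NOT real IPL data).
seasons = np.arange(1, 11)
avg_scores = np.array([28.0, 31.5, 30.2, 35.1, 33.8, 38.4, 36.9, 41.2, 40.5, 43.0])

# Fit a straight line y = slope * x + intercept by least squares.
slope, intercept = np.polyfit(seasons, avg_scores, 1)
trend = slope * seasons + intercept

print(f"slope = {slope:.2f} runs per season, intercept = {intercept:.2f}")
```

With Matplotlib, the original series and the fitted values can then be drawn on the same axes, for example with `plt.plot(seasons, avg_scores)` followed by `plt.plot(seasons, trend, linestyle="--")`.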

Posted in Python, statistics.

Beta Distribution Explained with Python Examples

In this post, you will learn about the Beta probability distribution with the help of Python examples. As a data scientist, it is very important to understand the beta distribution, as it is very commonly used as a prior in Bayesian modeling. This post covers the following topics: beta distribution intuition and examples, an introduction to the beta distribution, and beta distribution Python examples.

Beta Distribution Intuition & Examples

The beta distribution is widely used to model prior beliefs about an unknown probability in real-world applications. Here is a great article on understanding the beta distribution with a baseball example. You may want to pay attention to the fact that even if the baseball …
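
To get a feel for how the two shape parameters encode a prior belief, here is a minimal standard-library sketch of the beta density and its mean. The helper names and the parameter pairs are assumptions picked for illustration.

```python
from math import gamma


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, for 0 < x < 1."""
    normalizer = gamma(a + b) / (gamma(a) * gamma(b))
    return normalizer * x ** (a - 1) * (1 - x) ** (b - 1)


def beta_mean(a, b):
    """Mean of Beta(a, b)."""
    return a / (a + b)


# Assumed example priors: flat, weakly centered on 0.5, and skewed toward low values.
for a, b in [(1, 1), (2, 2), (2, 8)]:
    print(f"Beta({a}, {b}): mean = {beta_mean(a, b):.2f}, density at 0.5 = {beta_pdf(0.5, a, b):.3f}")
```

Beta(1, 1) is the uniform distribution on [0, 1]; larger values of a and b concentrate the belief more tightly around the mean a / (a + b).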

Posted in Data Science, statistics.

Bernoulli Distribution Explained with Python Examples

In this post, you will learn about the Bernoulli distribution along with real-world examples and Python code samples. As a data scientist, it is very important to understand the statistical concepts behind different probability distributions in order to understand how data is distributed. This post covers the following topics: an introduction to the Bernoulli distribution, Bernoulli distribution real-world examples, and Bernoulli distribution Python code examples.

Introduction to Bernoulli Distribution

The Bernoulli distribution is a discrete probability distribution of a random variable which can take only one of two possible values, such as 1 or 0, yes or no, true or false. The probability of …
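
The two-outcome definition above can be sketched in a few lines of standard-library Python. The helper name `bernoulli_pmf`, the success probability of 0.3 and the fixed random seed are assumptions for illustration.

```python
import random


def bernoulli_pmf(x, p):
    """P(X = x) for a Bernoulli variable: p for x == 1, and 1 - p for x == 0."""
    if x not in (0, 1):
        raise ValueError("a Bernoulli variable only takes the values 0 and 1")
    return p if x == 1 else 1 - p


p_success = 0.3  # assumed probability of the outcome coded as 1
rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulate 10,000 independent Bernoulli trials.
draws = [1 if rng.random() < p_success else 0 for _ in range(10_000)]
sample_mean = sum(draws) / len(draws)

print(f"P(X = 1) = {bernoulli_pmf(1, p_success)}, P(X = 0) = {bernoulli_pmf(0, p_success)}")
print(f"share of 1s in the simulation = {sample_mean:.3f}")
```

The share of 1s in a large simulation should land close to p, which is also the mean of the distribution.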

Posted in Data Science, statistics.

Bayes Theorem Explained with Examples

In this post, you will learn about Bayes’ theorem with the help of examples. It is of utmost importance to get a good understanding of Bayes’ theorem in order to create probabilistic models. Bayes’ theorem is also called Bayes’ rule or Bayes’ law. One of the many applications of Bayes’ theorem is Bayesian inference, which is one of the approaches to statistical inference (the other being frequentist inference) and is fundamental to Bayesian statistics. In this post, you will learn about the following: an introduction to Bayes’ theorem, and real-world examples of Bayes’ theorem.

Introduction to Bayes’ Theorem

In simple words, Bayes’ theorem is used to determine the probability of a hypothesis in the presence of more evidence or information. In other …
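
As a small worked example of updating the probability of a hypothesis once evidence arrives, the sketch below applies Bayes' theorem, P(H | E) = P(E | H) P(H) / P(E), to a hypothetical screening test. All three input numbers and the helper name `posterior` are assumptions made for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) from P(H), P(E | H) and P(E | not H)."""
    # Total probability of seeing the evidence, with or without the hypothesis being true.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical screening test: 1% of people have the condition, the test detects it
# 95% of the time, and it wrongly flags 5% of people who do not have it.
p_condition_given_positive = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)

print(f"P(condition | positive test) = {p_condition_given_positive:.3f}")
```

Even with a fairly accurate test, the posterior stays low here because the prior is small, which is exactly the kind of intuition Bayes' theorem helps build.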

Posted in Bayesian, Data Science, statistics.

Joint & Conditional Probability Explained with Examples

In this post, you will learn about the differences between joint and conditional probability, with examples. When starting with Bayesian analytics, it is very important to have a good understanding of probability concepts. Concepts such as joint and conditional probability are fundamental to probability and key to Bayesian modeling in machine learning. As a data scientist, you should get a good understanding of probability-related concepts.

Joint & Conditional Probability Concepts

In this section, you will learn about the basic concepts of joint and conditional probability. The probability of an event can be quantified as a function of the uncertainty of whether that event will occur or not. Let’s say an event A is …
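
The difference between a joint and a conditional probability is easy to see on a single roll of a fair die. The sketch below is an assumed toy example (the events "even" and "greater than 3" are not taken from the post) and uses exact fractions from the standard library.

```python
from fractions import Fraction

outcomes = range(1, 7)  # one roll of a fair six-sided die


def prob(event):
    """Probability of an event, given as a predicate over equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def is_even(o):
    return o % 2 == 0


def greater_than_3(o):
    return o > 3


p_joint = prob(lambda o: is_even(o) and greater_than_3(o))  # P(A and B)
p_b = prob(greater_than_3)                                  # P(B)
p_a_given_b = p_joint / p_b                                 # P(A | B) = P(A and B) / P(B)

print(f"P(even and > 3) = {p_joint}")
print(f"P(> 3)          = {p_b}")
print(f"P(even | > 3)   = {p_a_given_b}")
```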

Posted in Bayesian, Data Science, statistics.

What, When & How of Scatterplot Matrix in Python

In this post, you will learn about the scatterplot matrix. Note that a scatterplot matrix is also called a pairplot. Later in this post, you will find a Python code example that uses a scatterplot matrix / pairplot (seaborn package). The post covers the following questions: What is a scatterplot matrix? When should you use a scatterplot matrix / pairplot? How do you use a scatterplot matrix in Python?

What is Scatterplot Matrix?

A scatterplot matrix is a matrix (or grid) of scatter plots where each scatter plot in the grid is drawn for a different pair of variables. In other words, a scatterplot matrix represents the bivariate or pairwise relationships between different combinations of variables …

Posted in Data Science, Python, statistics.

Hypergeometric Distribution Explained with 10+ Examples

In this post, we will learn about the hypergeometric distribution with 10+ examples. The following topics will be covered: what the hypergeometric distribution is, and 10+ examples of the hypergeometric distribution. If you are an aspiring data scientist looking to understand the hypergeometric distribution better, this post might be very helpful. The binomial distribution can be considered a very good approximation of the hypergeometric distribution as long as the sample consists of 5% or less of the population. You need a good understanding of the binomial distribution in order to understand the hypergeometric distribution well. I would recommend you take a look at some of my related posts on …
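
The approximation claim can be checked numerically. The sketch below compares the two probability mass functions for an assumed population of 1,000 items with 300 "successes" and a sample of 20 (2% of the population); the helper names and all numbers are made up for illustration.

```python
from math import comb


def hypergeom_pmf(k, population, successes, sample):
    """P(k successes in a sample drawn WITHOUT replacement)."""
    return comb(successes, k) * comb(population - successes, sample - k) / comb(population, sample)


def binomial_pmf(k, n, p):
    """P(k successes in n independent trials, i.e. sampling WITH replacement)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


population, successes, sample = 1000, 300, 20  # assumed: the sample is 2% of the population
p = successes / population

max_gap = max(
    abs(hypergeom_pmf(k, population, successes, sample) - binomial_pmf(k, sample, p))
    for k in range(sample + 1)
)
print(f"largest difference between the two PMFs: {max_gap:.5f}")
```

Increasing the sample size toward a larger share of the population should make the gap grow, because drawing without replacement then changes the success probability noticeably from draw to draw.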

Posted in AI, Data Science, Machine Learning, statistics.

Binomial Distribution with Python Code Examples

In this post, you will find code examples, written with the Python NumPy package, related to the binomial distribution. You may want to check out the post Binomial Distribution explained with 10+ examples to get an understanding of the binomial distribution with the help of several examples. All of those examples can be tried with the code samples given in this post. Here are the instructions:

Load the NumPy package: First and foremost, load the NumPy and Seaborn libraries.

Code Syntax – np.random.binomial(n, p, size=1): The code np.random.binomial(n, p, size=1) returns the number of successes in one (size=1) experiment consisting of n trials, with the probability/proportion of success being p.

Tossing a …
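
Here is a minimal sketch of the `np.random.binomial(n, p, size)` call described above. The choice of 10 coin tosses, 1,000 repeated experiments and the fixed seed are assumptions made for this example.

```python
import numpy as np

np.random.seed(42)  # fixed seed so the run is repeatable

n_trials, p_success = 10, 0.5  # assumed example: 10 tosses of a fair coin

# One experiment: returns an array holding a single count of successes.
one_experiment = np.random.binomial(n_trials, p_success, size=1)

# Many experiments: one count of successes per experiment.
many_experiments = np.random.binomial(n_trials, p_success, size=1000)

print("heads in one experiment:", one_experiment[0])
print("average heads over 1000 experiments:", many_experiments.mean())
```

The average over many experiments should land close to n * p, which is 5 in this setup.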

Posted in AI, Data Science, Machine Learning, statistics.

Beta Distribution Example for Cricket Score Analysis

Figure: Virat Kohli score probability using beta distribution

This post presents a real-world example of the Binomial and Beta probability distributions from the field of sports. In this post, you will learn how the runs scored by a cricket player can be modeled using the Binomial and Beta distributions. Ever wanted to predict the probability of Virat Kohli scoring a half-century in a particular match? This post will present a perspective on that by using the beta distribution to model the probability of runs that can be scored in a match. If you are a data scientist trying to understand the beta and binomial distributions with a real-world example, this post will turn out to be helpful. First and foremost, let’s identify the random variable that we would like …
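
One way to combine the two distributions is the standard beta-binomial update: treat each match as a success (half-century) or failure, start from a Beta prior on the success probability, and add the observed counts to the prior parameters. The sketch below uses made-up counts, not real match statistics, and the excerpt does not confirm that the post models the problem exactly this way.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing binomial data under a Beta prior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of Beta(a, b), used here as the point estimate of the success probability."""
    return a / (a + b)


# Assumed prior belief: roughly a 25% chance of a half-century, held weakly.
prior_a, prior_b = 2, 6

# Made-up observations: 4 half-centuries in 10 matches (NOT real statistics).
post_a, post_b = update_beta(prior_a, prior_b, successes=4, failures=6)

print(f"prior mean     = {beta_mean(prior_a, prior_b):.3f}")
print(f"posterior mean = {beta_mean(post_a, post_b):.3f}")
```

The posterior mean sits between the prior mean and the observed rate of 0.4, and it moves closer to the observed rate as more matches are added.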

Posted in AI, Data Science, Machine Learning, statistics.