Category Archives: statistics

Fixed vs Random vs Mixed Effects Models – Examples

[Image: fixed and random effects models]

In this post, you will learn about fixed and random effects models, along with when to use a fixed effects model and when to go for a fixed + random effects (mixed) model. The concepts are explained with examples. As a data scientist, you should have a good understanding of these concepts, as they help you build better linear models such as general linear mixed models or generalized linear mixed models (GLMM). The following topics are covered in this post: what fixed, random and mixed effects models are; and when to use fixed effects vs mixed effects models. What are fixed, random & mixed effects models? First, we will take a real world …

Posted in Data Science, statistics.

Negative Binomial Distribution Python Examples

[Image: Negative Binomial Probability Distribution]

In this post, you will learn about the negative binomial distribution, explained using real-world examples and Python code. We will go over the following topics: what the negative binomial distribution is; the difference between the binomial and negative binomial distributions; real-world examples; and a Python example. What is Negative Binomial Distribution? The negative binomial distribution is a discrete probability distribution of a random variable, X, which is the number of Bernoulli trials required to obtain r successes. This random variable is called a negative binomial random variable. And, the experiment representing the X Bernoulli trials required to produce r successes is called …
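The definition above (X is the number of Bernoulli trials needed to reach r successes) can be sketched in a few lines. This is a minimal illustration using only the standard library, not the code from the full post; the function name `negative_binomial_pmf` and the example values are assumptions.

```python
from math import comb


def negative_binomial_pmf(n: int, r: int, p: float) -> float:
    """P(X = n): probability that the r-th success happens exactly on trial n."""
    if n < r:
        # Fewer than r trials can never contain r successes.
        return 0.0
    # The last trial must be a success; the other r - 1 successes
    # can fall anywhere among the first n - 1 trials.
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)


# Illustrative values: chance that the 3rd head of a fair coin arrives on toss 5.
third_head_on_fifth_toss = negative_binomial_pmf(n=5, r=3, p=0.5)
print(third_head_on_fifth_toss)
```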

Posted in statistics.

Poisson Distribution Explained with Python Examples

In this post, you will learn about the Poisson probability distribution with Python examples. As a data scientist, you must get a good understanding of probability distributions, including the normal, binomial and Poisson distributions. The Poisson distribution is a discrete probability distribution which represents the probability of an event occurring r times in a given interval of time or space, if these events occur at a known constant mean rate and independently of each other. The following are the key criteria for a random variable to follow the Poisson distribution. Individual events occur at random and independently in a given interval. This can be an interval of time or …
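As a rough sketch of the definition above, the probability of an event occurring r times at a constant mean rate can be computed with the standard Poisson formula. The function name `poisson_pmf` and the rate of 2 events per interval are assumptions chosen for illustration; the full post may use a library routine instead.

```python
from math import exp, factorial


def poisson_pmf(r: int, rate: float) -> float:
    """P(X = r) for events occurring independently at a constant mean rate."""
    return exp(-rate) * rate**r / factorial(r)


# Illustrative values: an average of 2 events per interval.
for r in range(5):
    print(r, round(poisson_pmf(r, rate=2.0), 4))
```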

Posted in Data Science, statistics.

Geometric Distribution Explained with Python Examples

In this post, you will learn about the geometric probability distribution with the help of real-world examples and Python code examples. It is of utmost importance for data scientists to understand and build an intuition for different kinds of probability distributions, including the geometric distribution. You may want to check out some of my posts on other probability distributions: Normal distribution explained with Python examples; Binomial distribution explained with 10+ examples; Hypergeometric distribution explained with 10+ examples. In this post, the following topics are covered: geometric probability distribution concepts; geometric distribution Python examples; geometric distribution real-world examples. Geometric Probability Distribution Concepts: The geometric probability distribution is a discrete …

Posted in Data Science, statistics.

Z-Score Explained with Ronaldo / Robert Example

Here is the data on the performance of Cristiano Ronaldo and Robert Lewandowski in the 2019-2020 Champions League (ESPN.in).

Player               No. of Matches Played   No. of Goals Scored   Avg Goals / Match
Cristiano Ronaldo    8                       4                     0.5
Robert Lewandowski   10                      15                    1.5

Table 1. Ronaldo / Robert performance in the 2019-2020 Champions League.

The average goals / match indicates that Robert Lewandowski played much better than Cristiano Ronaldo. However, can we conclude the same using statistical measures? How could we find out whether each of them performed better than their own performance over the last 7-8 years? This is where the Z-score comes into the picture. In the above evaluation, what is used to compare the performance is average goals / …
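The idea of comparing a season against a player's own history can be sketched as follows. This is a minimal illustration, not the post's calculation: the helper name `z_score` is an assumption, the use of the population standard deviation is an assumption, and the past-season numbers are hypothetical placeholders rather than real player data.

```python
from statistics import mean, pstdev


def z_score(value: float, history: list) -> float:
    """How many standard deviations `value` lies from the mean of `history`."""
    mu = mean(history)
    sigma = pstdev(history)  # assumption: population standard deviation
    return (value - mu) / sigma


# Hypothetical goals-per-match averages for earlier seasons (NOT real data).
hypothetical_history = [0.5, 0.75, 1.0, 0.75, 0.5]

# Score a season average of 1.5 goals per match against that made-up history.
print(z_score(1.5, hypothetical_history))
```

A positive value means the season was above the player's own historical average, and a larger magnitude means a more unusual season.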

Posted in statistics.

Python – How to Add Trend Line to Line Chart / Graph

[Image: Chris Gayle, Rohit Sharma, Dhoni, Virat Kohli IPL batting average score trend line]

In this post, you will learn how to add a trend line to a line chart / line graph using Python Matplotlib. As a data scientist, it is helpful to learn the concepts and related Python code for adding a trend line to a line chart, as it helps in understanding the trend and making decisions. In this post, we will consider the example of the IPL average batting scores of Virat Kohli, Chris Gayle, MS Dhoni and Rohit Sharma over the last 10 years, and assess the trend in their overall performance using trend lines. Let’s say that the main reason why we want to …
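A straight trend line is a least-squares fit through the points of the line chart. The sketch below computes that fit with the standard library only; the function name `fit_trend_line` is an assumption, the batting averages are hypothetical placeholders rather than real IPL figures, and the full post may well compute the fit differently.

```python
def fit_trend_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


seasons = list(range(1, 11))
# Hypothetical season batting averages (NOT real player data).
hypothetical_averages = [28, 31, 27, 35, 33, 38, 36, 41, 39, 44]

slope, intercept = fit_trend_line(seasons, hypothetical_averages)
trend_values = [slope * x + intercept for x in seasons]
print(slope, intercept)
```

The `trend_values` list can then be drawn over the original series with a second Matplotlib `plot` call on the same axes, for example as a dashed line.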

Posted in Python, statistics.

Beta Distribution Explained with Python Examples

In this post, you will learn about the Beta probability distribution with the help of Python examples. As a data scientist, it is very important to understand the beta distribution, as it is very commonly used as a prior in Bayesian modeling. In this post, the following topics are covered: beta distribution intuition and examples; introduction to the beta distribution; beta distribution Python examples. Beta Distribution Intuition & Examples: The beta distribution is widely used to model prior beliefs about a probability in real-world applications. Here is a great article on understanding the beta distribution with the example of a baseball game. You may want to pay attention to the fact that even if the baseball …
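To make the "prior belief" idea concrete, here is a minimal sketch of how a Beta prior over a success probability is updated after observing successes and failures. The function names and all numbers are assumptions chosen for illustration (a batting-average style prior), not values taken from the post.

```python
def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def update_beta(alpha: float, beta: float, successes: int, failures: int):
    """Posterior Beta parameters after observing binomial data."""
    return alpha + successes, beta + failures


# Hypothetical prior belief: success probability around 0.27.
prior_alpha, prior_beta = 81, 219
print(beta_mean(prior_alpha, prior_beta))

# Hypothetical new evidence: 100 successes and 200 failures.
post_alpha, post_beta = update_beta(prior_alpha, prior_beta, 100, 200)
print(beta_mean(post_alpha, post_beta))
```

The posterior mean moves from the prior belief toward the observed success rate, and it moves further as more data is observed.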

Posted in Data Science, statistics.

Bernoulli Distribution Explained with Python Examples

In this post, you will learn about the Bernoulli distribution along with real-world examples and Python code samples. As a data scientist, it is very important to understand the statistical concepts behind different probability distributions in order to understand how data is distributed. In this post, the following topics are covered: introduction to the Bernoulli distribution; Bernoulli distribution real-world examples; Bernoulli distribution Python code examples. Introduction to Bernoulli Distribution: The Bernoulli distribution is a discrete probability distribution of a random variable which can take only one of two possible values, such as 1 or 0, yes or no, true or false etc. The probability of …
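A two-outcome random variable like the one described above can be sketched with the standard library. The names `bernoulli_pmf` and `bernoulli_sample` and the success probability of 0.3 are assumptions for illustration only.

```python
import random


def bernoulli_pmf(x: int, p: float) -> float:
    """P(X = x) for a Bernoulli variable with success probability p."""
    if x not in (0, 1):
        raise ValueError("a Bernoulli variable only takes the values 0 and 1")
    return p if x == 1 else 1 - p


def bernoulli_sample(p: float, rng: random.Random) -> int:
    """Draw one outcome: 1 (success) with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0


rng = random.Random(42)  # fixed seed so the run is repeatable
draws = [bernoulli_sample(0.3, rng) for _ in range(10)]
print(draws, bernoulli_pmf(1, 0.3), bernoulli_pmf(0, 0.3))
```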

Posted in Data Science, statistics.

Bayes Theorem Explained with Examples

In this post, you will learn about Bayes’ theorem with the help of examples. It is of utmost importance to get a good understanding of Bayes’ theorem in order to create probabilistic models. Bayes’ theorem is also called Bayes’ rule or Bayes’ law. One of the many applications of Bayes’ theorem is Bayesian inference, which is one of the approaches to statistical inference (the other being Frequentist inference) and is fundamental to Bayesian statistics. In this post, you will learn about the following: introduction to Bayes’ theorem; Bayes’ theorem real-world examples. Introduction to Bayes’ Theorem: In simple words, Bayes’ theorem is used to determine the probability of a hypothesis in the presence of more evidence or information. In other …
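Updating the probability of a hypothesis once evidence arrives can be sketched as below. This is a minimal illustration of the standard formula P(H|E) = P(E|H) P(H) / P(E); the function name `posterior` and the three input probabilities are hypothetical values, not figures from the post.

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """P(H | E) from P(H), P(E | H) and P(E | not H)."""
    # Total probability of seeing the evidence, with or without the hypothesis.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: a rare hypothesis (1%) and fairly reliable evidence.
p_h_given_e = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(p_h_given_e)
```

With these made-up inputs the posterior is far higher than the 1% prior but still well below 50%, because the hypothesis was rare to begin with.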

Posted in Bayesian, Data Science, statistics.

Joint & Conditional Probability Explained with Examples

In this post, you will learn about the differences between joint and conditional probability, with examples. When starting with Bayesian analytics, it is very important to have a good understanding of probability concepts. Concepts such as joint and conditional probability are fundamental to probability and key to Bayesian modeling in machine learning. As a data scientist, you must get a good understanding of probability-related concepts. Joint & Conditional Probability Concepts: In this section, you will learn about the basic concepts of joint and conditional probability. The probability of an event can be quantified as a function of the uncertainty of whether that event will occur or not. Let’s say an event A is …
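The difference between the two quantities can be sketched on a single roll of a fair six-sided die. The events chosen here (A = "even", B = "greater than 3") and the helper name `probability` are assumptions for illustration, not the post's own example.

```python
from fractions import Fraction

outcomes = range(1, 7)  # faces of a fair die


def probability(event) -> Fraction:
    """Fraction of equally likely outcomes for which `event` is true."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))


def is_even(o):
    return o % 2 == 0


def greater_than_3(o):
    return o > 3


# Joint probability: both A and B happen (faces 4 and 6).
p_joint = probability(lambda o: is_even(o) and greater_than_3(o))
# Conditional probability: A given that B is known to have happened.
p_a_given_b = p_joint / probability(greater_than_3)
print(p_joint, p_a_given_b)
```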

Posted in Bayesian, Data Science, statistics.

What, When & How of Scatterplot Matrix in Python

In this post, you will learn about the following in relation to the scatterplot matrix. Note that a scatterplot matrix is also called a pairplot. Later in this post, you will find a Python code example that uses a scatterplot matrix / pairplot (seaborn package). Topics covered: what a scatterplot matrix is; when to use a scatterplot matrix / pairplot; how to use a scatterplot matrix in Python. What is Scatterplot Matrix? A scatterplot matrix is a matrix (or grid) of scatter plots, where each scatter plot in the grid is created for a different combination of variables. In other words, a scatterplot matrix represents the bi-variate or pairwise relationships between different combinations of variables …
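The grid structure described above can be sketched without any plotting library by listing which pair of variables each cell would show. The function name `scatterplot_grid` and the variable names are hypothetical; this only illustrates the layout, not the drawing.

```python
def scatterplot_grid(variables):
    """Return the (row variable, column variable) pair for every grid cell."""
    return [[(row_var, col_var) for col_var in variables] for row_var in variables]


# Hypothetical variable names, for illustration only.
grid = scatterplot_grid(["height", "weight", "age"])
for row in grid:
    print(row)

# Diagonal cells pair a variable with itself; plotting libraries
# typically show that variable's own distribution there instead.
off_diagonal = [pair for row in grid for pair in row if pair[0] != pair[1]]
print(len(off_diagonal))
```

With seaborn, the whole grid is produced from a pandas DataFrame by a single `seaborn.pairplot` call.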

Posted in Data Science, Python, statistics.

Standard Deviation of Population & Sample – Python

In this post, you will learn about the statistical concept of standard deviation with the help of a Python code example. The following topics are covered in this post: what standard deviation is; different techniques for calculating standard deviation; standard deviation of a population vs a sample. What is Standard Deviation? The standard deviation (SD) of a data set is a measure of how spread out the data is. Take a look at the following example using two different samples of 4 numbers whose means are the same but whose standard deviations (data spread) are different. Here is the code for calculating the mean of the above sample. One can either write Python code …
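The contrast between equal means and different spreads, and between the population and sample formulas, can be sketched with the standard `statistics` module. The two 4-number samples below are made up for illustration and are not the ones used in the post.

```python
from statistics import mean, pstdev, stdev

# Two hypothetical samples of 4 numbers with the same mean but different spread.
tight = [4, 5, 5, 6]
wide = [1, 3, 7, 9]

print(mean(tight), mean(wide))  # identical means

# Population standard deviation divides the squared deviations by N.
population_sd_tight = pstdev(tight)
population_sd_wide = pstdev(wide)

# Sample standard deviation divides by N - 1, so it is slightly larger.
sample_sd_tight = stdev(tight)
sample_sd_wide = stdev(wide)

print(population_sd_tight, population_sd_wide)
print(sample_sd_tight, sample_sd_wide)
```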

Posted in Data Science, Python, statistics.

Hypergeometric Distribution Explained with 10+ Examples

In this post, we will learn about the hypergeometric distribution with 10+ examples. The following topics are covered in this post: what the hypergeometric distribution is; 10+ examples of the hypergeometric distribution. If you are an aspiring data scientist looking to learn and understand the binomial distribution better, this post might be very helpful. The binomial distribution is a very good approximation of the hypergeometric distribution as long as the sample consists of 5% or less of the population. One needs a good understanding of the binomial distribution in order to understand the hypergeometric distribution well. I would recommend you take a look at some of my related posts on …
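The approximation claim above can be checked numerically with a small sketch. The function names and the population figures (1000 items, 300 of them successes, a sample of 20, i.e. 2% of the population) are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_pmf(k: int, population: int, successes: int, sample: int) -> float:
    """P(k successes) when drawing `sample` items WITHOUT replacement."""
    return (
        comb(successes, k)
        * comb(population - successes, sample - k)
        / comb(population, sample)
    )


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(k successes) in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical setup: the sample is 2% of the population.
population, successes, sample = 1000, 300, 20
p = successes / population

max_gap = max(
    abs(hypergeom_pmf(k, population, successes, sample) - binom_pmf(k, sample, p))
    for k in range(sample + 1)
)
print(max_gap)
```

Increasing `sample` toward a large share of `population` makes the gap between the two distributions grow, which is why the rule of thumb is stated in terms of the sample fraction.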

Posted in AI, Data Science, Machine Learning, statistics.

Binomial Distribution with Python Code Examples

[Image: sample binomial distribution plot]

In this post, you will learn code examples, written with the Python Numpy package, related to the binomial distribution. You may want to check out the post, Binomial Distribution explained with 10+ examples, to get an understanding of the binomial distribution with the help of several examples. All of the examples can be tried with the code samples given in this post. Here are the instructions: Load the packages: first and foremost, load the Numpy and Seaborn libraries. Code syntax, np.random.binomial(n, p, size=1): the call np.random.binomial(n, p, size=1) returns the number of successes in one (size=1) experiment comprising n trials, with the probability/proportion of success being p. Tossing a …
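To show what one call of that kind represents, here is a pure standard-library sketch of the same experiment: run n trials, count the successes, and repeat for `size` experiments. The function `simulate_binomial` is a hypothetical stand-in written for illustration; it is not NumPy's implementation and returns a plain list rather than an array.

```python
import random


def simulate_binomial(n: int, p: float, size: int = 1, seed=None) -> list:
    """Number of successes in each of `size` experiments of n trials."""
    rng = random.Random(seed)
    return [
        sum(1 for _ in range(n) if rng.random() < p)  # count successful trials
        for _ in range(size)
    ]


# Illustrative run: 5 experiments of 10 fair coin tosses each.
heads_per_experiment = simulate_binomial(n=10, p=0.5, size=5, seed=0)
print(heads_per_experiment)
```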

Posted in AI, Data Science, Machine Learning, statistics.

Binomial Distribution Explained with 10+ Examples

[Image: binomial experiment, coin tossing, 100 experiments of 50 trials]

In this post, we will learn about the binomial distribution with 10+ examples. The following topics are covered in this post: what the binomial distribution is; a binomial distribution Python example; 10+ examples of the binomial distribution. If you are an aspiring data scientist looking to learn and understand the binomial distribution better, this post might be very helpful. What is a Binomial Distribution? The binomial distribution is a discrete probability distribution that represents the probabilities of binomial random variables in a binomial experiment. What is an Experiment? An experiment is a set of one or more repeated trials, each resulting in one particular outcome out of many possible outcomes. Thus, an experiment could consist of 1 …
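The probabilities the binomial distribution assigns can be sketched with the standard formula for k successes in n trials. The function name `binomial_pmf` is an assumption, and the 50-toss fair-coin setup is used purely as an illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes) in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative experiment: 50 tosses of a fair coin.
n_trials, p_heads = 50, 0.5
distribution = [binomial_pmf(k, n_trials, p_heads) for k in range(n_trials + 1)]

# The most likely number of heads and its probability.
most_likely = max(range(n_trials + 1), key=lambda k: distribution[k])
print(most_likely, distribution[most_likely])
```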

Posted in AI, Data Science, Machine Learning, statistics.

Beta Distribution Example for Cricket Score Analysis

[Image: Virat Kohli score probability using beta distribution]

This post presents a real-world example of the binomial and beta probability distributions from the field of sports. In this post, you will learn how the runs scored by a cricket player can be modeled using the binomial and beta distributions. Ever wanted to predict the probability of Virat Kohli scoring a half-century in a particular match? This post presents a perspective on this by using the beta distribution to model the probability of the runs that can be scored in a match. If you are a data scientist trying to understand the beta and binomial distributions with a real-world example, this post will turn out to be helpful. First and foremost, let’s identify the random variable that we would like …

Posted in AI, Data Science, Machine Learning, statistics.