Poisson Distribution Explained with Python Examples

In this post, you will learn about the Poisson probability distribution, with Python examples. As a data scientist, you need a good understanding of common probability distributions, including the normal, binomial, and Poisson distributions.

The Poisson distribution is a discrete probability distribution that gives the probability of an event occurring r times in a given interval of time or space, provided the events occur at a known constant mean rate and independently of each other. A random variable follows the Poisson distribution when the following criteria are met:

  • Individual events occur at random and independently in a given interval. This can be an interval of time or space.
  • The mean number of occurrences of events in an interval (time or space) is finite and known. This mean is denoted by \(\lambda\).

The random variable X represents the number of times that the event occurs in the given interval of time or space. If X follows the Poisson distribution, this is written as follows:

X ~ Po(\(\lambda\))

In the above expression, \(\lambda\) represents the mean number of occurrences in a given interval. Mathematically, the Poisson probability distribution can be represented using the following probability mass function:

\(\Large P(X=r) = \frac{e^{-\lambda}\lambda^r}{r!}\)

In the above formula, \(\lambda\) is the mean number of occurrences and r is a value that the random variable X can take (r = 0, 1, 2, ...).
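As a quick check of the formula, here is a minimal sketch that evaluates the probability mass function directly and compares it with scipy.stats.poisson.pmf. The helper name poisson_pmf and the value \(\lambda\) = 2 are illustrative choices, not part of any library.

```python
import math
from scipy.stats import poisson

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam), computed directly from the formula."""
    return math.exp(-lam) * lam**r / math.factorial(r)

lam = 2  # illustrative mean number of occurrences
for r in range(4):
    manual = poisson_pmf(r, lam)
    library = poisson.pmf(r, lam)
    print(f"P(X={r}): formula = {manual:.4f}, scipy = {library:.4f}")
```

The two columns should agree to within floating-point error, which confirms that pmf implements the formula above.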

Expectation & Variance of Poisson Distribution

The expected value and the variance of a Poisson random variable are equal, and both are given by \(\lambda\), the mean number of occurrences in an interval (time or space):

\(\Large E(X) = \lambda\)

\(\Large Var(X) = \lambda\)
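The equality of mean and variance can be checked numerically. The sketch below asks scipy.stats.poisson for both moments and cross-checks them by summing over the probability mass function. The value \(\lambda\) = 2 and the cutoff of 100 terms are assumptions made for illustration; the terms beyond that cutoff are negligible for such a small \(\lambda\).

```python
from scipy.stats import poisson

lam = 2  # illustrative mean number of occurrences

# Mean and variance as reported by scipy
mean, var = poisson.stats(lam, moments="mv")
print(float(mean), float(var))

# Cross-check from the definitions E(X) = sum r*P(X=r) and
# Var(X) = sum (r - E(X))^2 * P(X=r), truncated at r = 99
rs = range(100)
mean_sum = sum(r * poisson.pmf(r, lam) for r in rs)
var_sum = sum((r - mean_sum) ** 2 * poisson.pmf(r, lam) for r in rs)
print(mean_sum, var_sum)
```

Both approaches should give values equal to \(\lambda\), up to floating-point error.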

Poisson Distribution Explained with Real-world examples

Here are some real-world examples of Poisson distribution.

  • Poisson distribution for a space interval: Suppose you are out on a long drive and good restaurants occur at a mean rate of 2 per 10 km. What is the probability of finding 0, 1, 2, 3, 4, or 5 restaurants in the next 10 km?

Here is the Python code, which also plots the Poisson probability distribution for finding 0 to 5 restaurants within 10 km, given that the mean number of restaurants per 10 km is 2. The poisson object from scipy.stats is used, and its pmf method calculates the probabilities.

from scipy.stats import poisson
import matplotlib.pyplot as plt
#
# X holds the possible numbers of restaurants found in 10 km
# The mean number of occurrences of restaurants in 10 km is 2
#
X = [0, 1, 2, 3, 4, 5]
lmbda = 2
#
# Probability of each value in X
#
poisson_pd = poisson.pmf(X, lmbda)
#
# Plot the probability distribution as markers on vertical lines
#
fig, ax = plt.subplots(1, 1, figsize=(8, 6))
ax.plot(X, poisson_pd, 'bo', ms=8, label='poisson pmf')
ax.vlines(X, 0, poisson_pd, colors='b', lw=5, alpha=0.5)
ax.set_ylabel("Probability", fontsize=18)
ax.set_xlabel("X - No. of Restaurants", fontsize=18)
ax.set_title("Poisson Distribution - No. of Restaurants Vs Probability", fontsize=18)
ax.legend()
plt.show()

Here is the resulting plot of the Poisson probability distribution for the number of restaurants found within 10 km:

Fig 1. Poisson Probability Distribution (X = No. of Restaurants in 10 KM)
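To read off the numbers behind the restaurant plot, the probabilities can also be printed. The sketch below additionally reports the probability of finding more than 5 restaurants, since the six plotted values do not sum to exactly 1. It uses the sf (survival function) method of scipy.stats.poisson, which returns P(X > k); the variable names are illustrative.

```python
from scipy.stats import poisson

lmbda = 2                 # mean number of restaurants per 10 km
X = [0, 1, 2, 3, 4, 5]    # values shown in the plot

probs = poisson.pmf(X, lmbda)
for k, p in zip(X, probs):
    print(f"P(X = {k}) = {p:.4f}")

# Probability mass not shown in the plot: P(X > 5)
tail = poisson.sf(5, lmbda)
print(f"P(X <= 5) = {probs.sum():.4f}, P(X > 5) = {tail:.4f}")
```

With a mean of 2, almost all of the probability mass lies in the plotted range of 0 to 5, so the tail is small.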
  • Poisson distribution for a time interval: Suppose the mean number of buses that arrive at a bus stop in a span of 30 minutes is 1. The Poisson distribution can be used to model the probability of X buses arriving at the bus stop within the next 30 minutes, where X takes the values 0, 1, 2, 3, or 4.

Here is the Python code, which also plots the Poisson probability distribution for 0 to 4 buses arriving at the bus stop within 30 minutes, given that the mean number of buses per 30-minute interval is 1.

from scipy.stats import poisson
import matplotlib.pyplot as plt
#
# X holds the possible numbers of buses arriving in 30 minutes
# The mean number of buses arriving at the bus stop in 30 minutes is 1
#
X = [0, 1, 2, 3, 4]
lmbda = 1
#
# Probability of each value in X
#
poisson_pd = poisson.pmf(X, lmbda)
#
# Plot the probability distribution as markers on vertical lines
#
fig, ax = plt.subplots(1, 1, figsize=(8, 6))
ax.plot(X, poisson_pd, 'bo', ms=8, label='poisson pmf')
ax.vlines(X, 0, poisson_pd, colors='b', lw=5, alpha=0.5)
ax.set_ylabel("Probability", fontsize=18)
ax.set_xlabel("X - No. of Buses", fontsize=18)
ax.set_title("Poisson Distribution - No. of Buses Vs Probability", fontsize=18)
ax.legend()
plt.show()

Here is the resulting plot of the Poisson probability distribution for the number of buses arriving at the bus stop in the next 30 minutes, given that the mean number of buses per 30 minutes at that stop is 1.

Fig 2. Poisson Probability Distribution (X = No. of Buses in 30 Min)
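One way to build intuition for the bus example is to simulate many 30-minute intervals and compare the observed frequencies with the theoretical probabilities. The sketch below is an illustration under assumed settings: the number of simulated intervals (100,000) and the seed (42) are arbitrary choices, and the samples are drawn with the rvs method of scipy.stats.poisson.

```python
import numpy as np
from scipy.stats import poisson

lmbda = 1              # mean number of buses per 30 minutes
n_intervals = 100_000  # assumed number of simulated intervals

# Draw the number of buses for each simulated 30-minute interval
samples = poisson.rvs(lmbda, size=n_intervals, random_state=42)

for k in range(5):
    empirical = np.mean(samples == k)   # observed share of intervals with k buses
    theoretical = poisson.pmf(k, lmbda)
    print(f"k = {k}: simulated = {empirical:.4f}, pmf = {theoretical:.4f}")

# The sample mean and sample variance should both be close to lmbda
print(samples.mean(), samples.var())
```

The simulated frequencies should be close to the values from pmf, and the sample mean and variance should both be close to 1, in line with E(X) = Var(X) = \(\lambda\).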

Conclusions

Here is a summary of what you learned in this post about the Poisson probability distribution:

  • The Poisson distribution is a discrete probability distribution.
  • It gives the probability of a given number of occurrences of an event within an interval (time or space), provided that the individual events are independent of each other and the mean number of occurrences in the interval is finite and known.
  • The expectation and the variance of a random variable following the Poisson distribution are both equal to \(\lambda\), the mean number of occurrences of the event in the given interval (time or space).
Ajitesh Kumar