Hidden Markov Models Explained with Examples

Hidden Markov models (HMMs) are a class of statistical models with a long history of use in fields such as medicine, computer science, and data science. The hidden Markov model (HMM) underlies many modern data science algorithms, where it turns sequences of observations into predictions or decisions. This blog post covers hidden Markov models with real-world examples and the key concepts behind them.

What are Markov Models?

Markov models are statistical models used to predict the next state based on the current state. A Markov model is a finite-state machine in which each state has an associated probability of transitioning to every other state in one step. Markov models can describe real-world problems involving hidden and observable states, and they can be classified as hidden or observable based on the type of information available for making predictions or decisions. Hidden Markov models deal with hidden variables that cannot be directly observed but only inferred from other observations, whereas an observable model, also termed a Markov chain, involves no hidden variables.

What is Markov Chain?

A Markov chain is a sequence of random variables: a random process that changes its state according to fixed probabilities. Markov chains are useful for understanding Markov models in general and, in particular, the hidden Markov models used in data science applications. The Markov chain is the simplest type of Markov model, one in which all states are directly observable; there are no hidden states.

A Markov chain has short-term memory: its next state depends only on its current state (or, in higher-order chains, on a fixed number of previous states). It remembers only where it is now and where it will transition to next, not how it got there.

A Markov chain consists of three important components:

  • Initial probability distribution: an initial probability distribution over states, where \(\pi_i\) is the probability that the Markov chain starts in state i. Some states j may have \(\pi_j = 0\), meaning that they cannot be initial states
  • One or more states
  • Transition probability distribution: a transition probability matrix A where each \(a_{ij}\) represents the probability of moving from state i to state j

The diagram below represents a Markov chain with three states representing the weather of the day (cloudy, rainy, and sunny), and transition probabilities giving the weather of the next day given the weather of the current day.
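Such a chain can be sketched in a few lines of Python. The transition probabilities below are made up for illustration, since the diagram's exact values are not given in the text:

```python
import random

# Illustrative three-state weather chain; these probabilities are
# invented for demonstration and do not come from real data.
states = ["cloudy", "rainy", "sunny"]
initial = {"cloudy": 0.3, "rainy": 0.3, "sunny": 0.4}
transition = {
    "cloudy": {"cloudy": 0.4, "rainy": 0.3, "sunny": 0.3},
    "rainy":  {"cloudy": 0.3, "rainy": 0.5, "sunny": 0.2},
    "sunny":  {"cloudy": 0.2, "rainy": 0.1, "sunny": 0.7},
}

def simulate(days, seed=0):
    """Sample a weather sequence: each day depends only on the previous day."""
    rng = random.Random(seed)
    state = rng.choices(states, weights=[initial[s] for s in states])[0]
    sequence = [state]
    for _ in range(days - 1):
        row = transition[state]
        state = rng.choices(states, weights=[row[s] for s in states])[0]
        sequence.append(state)
    return sequence

print(simulate(5))
```

Note that each step samples only from the current state's row of the transition table, which is exactly the short-term memory property described above.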

Markov chain

What are Hidden Markov models (HMM)?

The hidden Markov model (HMM) is a Markov model in which some states are hidden; this is where an HMM differs from a Markov chain. An HMM is a statistical model in which the system being modeled is a Markov process with unobserved or hidden states. The hidden state is one that cannot be directly observed but can be inferred from one or more observations under the Markov assumption. The Markov assumption is that a hidden state depends only on the previous hidden state: mathematically, the probability of being in a state at time t depends only on the state at time t-1. This is termed the limited horizon assumption. A second Markov assumption states that the conditional distribution over the next state, given the current state, does not change over time. This is termed the stationary process assumption.
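Writing \(q_t\) for the hidden state at time t, the two assumptions can be stated compactly:

```latex
% Limited horizon: the next state depends only on the current state
P(q_t \mid q_{t-1}, q_{t-2}, \ldots, q_1) = P(q_t \mid q_{t-1})

% Stationarity: transition probabilities do not depend on t
P(q_t = j \mid q_{t-1} = i) = a_{ij} \quad \text{for all } t
```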

A Markov model is made up of two components: the state transition probabilities and the hidden random variables that are conditioned on each other. A hidden Markov model consists of five important components:

  • Initial probability distribution: an initial probability distribution over states, where \(\pi_i\) is the probability that the model starts in hidden state i. Some states j may have \(\pi_j = 0\), meaning that they cannot be initial states. The initial distribution defines each hidden variable in its initial condition at time t=0 (the initial hidden state).
  • One or more hidden states
  • Transition probability distribution: a transition probability matrix where each \(a_{ij}\) represents the probability of moving from hidden state i to hidden state j. The transition matrix holds the hidden-state-to-hidden-state transition probabilities.
  • A sequence of observations
  • Emission probabilities: a sequence of observation likelihoods, also called emission probabilities, each expressing the probability of an observation \(o_t\) being generated from a hidden state i. For each hidden state, the emission probabilities define the conditional distribution over the observable outputs that state can produce.
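The five components above can be written down concretely as arrays. The numbers below are illustrative placeholders for a two-state weather model, not values taken from any real dataset:

```python
import numpy as np

# Hypothetical two-state HMM (rainy, sunny) with three possible
# observations (walk, shop, clean); all numbers are illustrative.
states = ["rainy", "sunny"]
observations = ["walk", "shop", "clean"]

pi = np.array([0.6, 0.4])            # initial distribution over hidden states
A = np.array([[0.7, 0.3],            # transition matrix: A[i, j] = P(state j | state i)
              [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5],       # emission matrix: B[i, k] = P(observation k | state i)
              [0.6, 0.3, 0.1]])

# Each distribution must sum to 1 over its outcomes.
assert np.isclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```

The fifth component, the observation sequence itself, would simply be a list such as `["shop", "walk", "clean"]` indexing into the columns of `B`.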

Let’s understand the above using the hidden Markov model representation shown below:

hidden markov model

The hidden Markov model in the above diagram represents the process of predicting whether someone will be found walking, shopping, or cleaning on a particular day, depending on whether the day is rainy or sunny. The following represents the five components of the hidden Markov model in the above diagram:

hidden markov model components 

Let's note the following in the above picture:

  • There are two hidden states: rainy and sunny. These states are hidden because what is observed as the process output is whether the person is shopping, walking, or cleaning.
  • The sequence of observations is shop, walk, and clean.
  • The initial probability distribution is represented by the start probabilities.
  • The transition probabilities represent the probability of moving from one state (rainy or sunny) to the other, given the current state.
  • The emission probabilities represent the probability of observing each output (shop, clean, or walk) given the state (rainy or sunny).
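Given such a model, the forward algorithm computes the probability of an observation sequence by summing over all possible hidden-state paths. The probabilities below are assumed for illustration; the diagram's actual values may differ:

```python
import numpy as np

# Assumed two-state model: hidden states [rainy, sunny],
# observations [walk, shop, clean]. Numbers are illustrative.
pi = np.array([0.6, 0.4])                  # start probabilities
A = np.array([[0.7, 0.3], [0.4, 0.6]])     # hidden-state transitions
B = np.array([[0.1, 0.4, 0.5],             # emissions: rows = states,
              [0.6, 0.3, 0.1]])            # cols = [walk, shop, clean]
obs_index = {"walk": 0, "shop": 1, "clean": 2}

def forward(obs):
    """Return P(obs) by summing over all hidden-state paths."""
    alpha = pi * B[:, obs_index[obs[0]]]   # joint prob of obs[0] and each state
    for o in obs[1:]:
        # Propagate through the transition matrix, then weight by emission.
        alpha = (alpha @ A) * B[:, obs_index[o]]
    return float(alpha.sum())

print(f"P(shop, walk, clean) = {forward(['shop', 'walk', 'clean']):.4f}")
```

The recursion keeps only a vector over hidden states at each step, so the cost is linear in the sequence length rather than exponential in the number of paths.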

The hidden Markov model is a special type of Bayesian network in which the hidden variables are discrete random variables. In a first-order hidden Markov model, the current hidden state depends only on the single previous hidden state; in a second-order model, it depends on the two previous hidden states.

The hidden Markov model represents two different kinds of variables: hidden states and observable states. A hidden state is one that cannot be directly observed or seen; an observable state is one that can. One hidden state can be associated with many observable states, and one observable state may be associated with more than one hidden state. The hidden Markov model uses probabilities to describe both transitions from one hidden state to another and emissions from hidden states to observable states.
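Going in the other direction, inferring hidden states from observations, is done with the Viterbi algorithm, which finds the single most probable hidden path. Again the model numbers are illustrative, not taken from the article's diagram:

```python
import numpy as np

# Assumed model: hidden states [rainy, sunny], observations
# [walk, shop, clean]. All probabilities are illustrative.
states = ["rainy", "sunny"]
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5],
              [0.6, 0.3, 0.1]])
obs_index = {"walk": 0, "shop": 1, "clean": 2}

def viterbi(obs):
    """Return the most probable hidden-state sequence for obs."""
    idx = [obs_index[o] for o in obs]
    delta = pi * B[:, idx[0]]          # best path probability ending in each state
    back = []                          # backpointers for path recovery
    for k in idx[1:]:
        scores = delta[:, None] * A    # scores[i, j]: come from state i, go to j
        back.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) * B[:, k]
    # Trace the best path backwards through the backpointers.
    path = [int(delta.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    path.reverse()
    return [states[i] for i in path]

print(viterbi(["walk", "shop", "clean"]))  # → ['sunny', 'rainy', 'rainy']
```

Unlike the forward algorithm, which sums over paths, Viterbi maximizes over them, so it answers "which hidden sequence best explains what I saw?" rather than "how likely was what I saw?".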

Real-world examples of Hidden Markov Models (HMM)

Here are a few real-world examples where the hidden Markov models are used:

  • Retail scenario: if you go to the grocery store once per week, it is relatively easy for a computer program to predict when your shopping trip will take more time. A hidden Markov model can estimate which visiting day takes longer than the others and use that information to determine why some visits run long while others do not. Another e-commerce example is the recommendation engine, where a hidden Markov model tries to predict the next item you are likely to buy.
  • Travel scenario: By using hidden Markov models, airlines can predict how long it will take a person to finish checking out from an airport. This allows them to know when they should start boarding passengers!
  • Medical scenario: hidden Markov models are used in various medical applications, where they try to find the hidden states of a human body system or organ. For example, cancer detection can be done by analyzing certain sequences and determining the danger they might pose to the patient. Hidden Markov models are also used to evaluate biological data such as RNA-Seq and ChIP-Seq, helping researchers understand gene regulation. Using a hidden Markov model, doctors can estimate the life expectancy of people based on their age, weight, height, and body type.
  • Marketing scenario: by using a hidden Markov model, marketers can understand at what stage of the marketing funnel users drop off and how to improve user conversion rates.

What are some libraries which can be used for training hidden Markov models?

  • PyTorch-HMM: a popular library for training hidden Markov models; it is written in Python and can be installed using pip.
  • hmmlearn: hidden Markov models in Python
  • PyHMM: a hidden Markov model library for Python
  • DeepHMM: A PyTorch implementation of a Deep Hidden Markov Model
  • HiddenMarkovModels.jl
  • HMMBase.jl
Ajitesh Kumar