Hidden Markov models (HMMs) are a class of statistical models that have been in use for decades. They have been applied in fields such as medicine, computer science, and data science, and they underpin many sequence-modeling algorithms. In data science, an HMM uses a sequence of observations to infer states that cannot be observed directly, which in turn supports prediction and decision-making. This blog post covers hidden Markov models, the important concepts behind them, and real-world examples.
Markov models are named after Andrey Markov, who first developed them in the early 1900s. A Markov model is a probabilistic model used to predict the future state of a system based on its current state. It can be viewed as a finite-state machine in which each state has an associated probability of moving to every other state (or staying where it is) in one step. Markov models can be used for real-world problems that involve hidden and observable states, and they are classified as hidden or observable according to the information available for making predictions or decisions. Hidden Markov models deal with hidden variables that cannot be observed directly and can only be inferred from other observations. In an observable model, also termed a Markov chain, no hidden variables are involved.
To better understand Markov models, let’s look at an example. Say you have a bag that contains four marbles: two red and two blue. You randomly select a marble from the bag, note its color, and then put it back in the bag. After repeating this process several times, you begin to notice a pattern: the probability of selecting a red marble is always two out of four, or 50%. This is because the probability of selecting a particular color is determined by the number of marbles of that color in the bag. Because every marble is returned, the contents of the bag never change, so the probability of the next draw is the same no matter which colors were drawn before.
This example illustrates the core idea of a Markov model: the future state of a system depends only on its current state, not on the path the system took to get there. In the case of the bag of marbles, the current state is the number of marbles of each color in the bag, and that alone determines the probability of selecting each color. The bag is a deliberately simple case: since its contents never change, the next draw does not even depend on the previous draw.
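The bag example can be checked with a short simulation. The sketch below is only an illustration: the function names, the seed, and the number of draws are assumptions made for this example. It estimates how often a red marble follows a red draw and how often it follows a blue draw; with replacement, both estimates should land close to 0.5.

```python
import random

def draw_marbles(n_draws, seed=0):
    """Draw with replacement from a bag of two red and two blue marbles."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    bag = ["red", "red", "blue", "blue"]
    return [rng.choice(bag) for _ in range(n_draws)]

def red_rate_after(draws, previous_color):
    """Fraction of draws that are red, given the color of the draw before."""
    followers = [cur for prev, cur in zip(draws, draws[1:]) if prev == previous_color]
    return sum(color == "red" for color in followers) / len(followers)

draws = draw_marbles(100_000)
print("P(red | previous red)  ~", round(red_rate_after(draws, "red"), 2))
print("P(red | previous blue) ~", round(red_rate_after(draws, "blue"), 2))
```

If the marbles were not put back, the contents of the bag would change after each draw and the two estimates would no longer agree.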
Markov models have many applications in the real world, including modeling the weather, stock market prices, and the spread of disease. They are also used in natural language processing applications such as speech recognition and machine translation. In speech recognition, Markov models help identify the most likely word or phrase given the words that precede it. In statistical machine translation, they have been used to choose the translation of a word based on the choices made for the preceding words in the sentence.
Markov chains, named after Andrey Markov, can be thought of as a machine or a system that hops from one state to another, typically forming a chain. Markov chains have the Markov property, which states that the probability of moving to any particular state next depends only on the current state and not on the previous states.
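A Markov chain consists of three important components: a set of states, an initial probability distribution over those states, and the transition probabilities of moving from each state to every other state.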
The diagram below represents a Markov chain with three states for the weather of the day (cloudy, rainy, and sunny). The transition probabilities give the weather of the next day given the weather of the current day.
The three states are cloudy, rainy, and sunny. The following represent the transition probabilities based on the above diagram:
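Using this Markov chain, what is the probability that Wednesday will be cloudy if today (Monday) is sunny? Tuesday can be sunny, rainy, or cloudy, so there are three two-step paths from a sunny Monday to a cloudy Wednesday. The probability of each path is the product of its two transition probabilities (Monday to Tuesday, then Tuesday to Wednesday), and the paths are mutually exclusive, so their probabilities add up.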
The total probability of a cloudy Wednesday = 0.2 + 0.03 + 0.04 = 0.27.
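The same two-step calculation can be sketched in code. The transition values below are assumptions: the original diagram is not reproduced here, so the sunny row and the transitions into cloudy were chosen to reproduce the three terms 0.2, 0.03, and 0.04 from the sum above, and the remaining entries are hypothetical fillers that make each row sum to 1.

```python
STATES = ["sunny", "cloudy", "rainy"]

# Assumed transition probabilities (not taken from the original diagram).
# P[today][tomorrow] is the probability of tomorrow's weather given today's.
P = {
    "sunny":  {"sunny": 0.5, "cloudy": 0.4, "rainy": 0.1},
    "cloudy": {"sunny": 0.4, "cloudy": 0.1, "rainy": 0.5},
    "rainy":  {"sunny": 0.1, "cloudy": 0.3, "rainy": 0.6},
}

def two_step_probability(start, end):
    """Sum over every possible intermediate state of P(start->mid) * P(mid->end)."""
    return sum(P[start][mid] * P[mid][end] for mid in STATES)

# Each path from a sunny Monday to a cloudy Wednesday, via Tuesday's weather.
for tuesday in STATES:
    path_probability = P["sunny"][tuesday] * P[tuesday]["cloudy"]
    print(f"sunny -> {tuesday} -> cloudy: {path_probability:.2f}")

print("total:", round(two_step_probability("sunny", "cloudy"), 2))  # 0.27
```

Summing over the intermediate state is one entry of the squared transition matrix, which is why longer horizons are usually computed with matrix powers.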
As shown above, the Markov chain is a process with a known finite number of states in which the probability of being in a particular state is determined only by the previous state.
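A Markov chain like the weather example can also be simulated. The sketch below is a minimal illustration, not a reference implementation: the transition values, the choice to always start on a sunny day, and the names `sample_chain`, `INITIAL`, and `TRANSITIONS` are all assumptions made for this example. Note that each new state is drawn using only the current state.

```python
import random

STATES = ["sunny", "cloudy", "rainy"]

# Assumed initial distribution: the chain always starts on a sunny day.
INITIAL = {"sunny": 1.0, "cloudy": 0.0, "rainy": 0.0}

# Assumed transition probabilities (hypothetical values, each row sums to 1).
TRANSITIONS = {
    "sunny":  {"sunny": 0.5, "cloudy": 0.4, "rainy": 0.1},
    "cloudy": {"sunny": 0.4, "cloudy": 0.1, "rainy": 0.5},
    "rainy":  {"sunny": 0.1, "cloudy": 0.3, "rainy": 0.6},
}

def sample_chain(n_days, seed=0):
    """Sample a weather sequence; each day depends only on the day before."""
    rng = random.Random(seed)
    state = rng.choices(STATES, weights=[INITIAL[s] for s in STATES])[0]
    path = [state]
    for _ in range(n_days - 1):
        row = TRANSITIONS[state]  # only the current state is consulted
        state = rng.choices(STATES, weights=[row[s] for s in STATES])[0]
        path.append(state)
    return path

print(sample_chain(7))
```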
The hidden Markov model (HMM) is another type of Markov model in which some of the states are hidden. This is where an HMM differs from a Markov chain. An HMM is a statistical model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. Each hidden state produces an observation, so a hidden state cannot be observed directly but can be inferred from the sequence of observations under the Markov assumption. The Markov assumption is that a hidden state depends only on the previous hidden state. Mathematically, the probability of being in a state at time t depends only on the state at time (t-1). This is termed the limited horizon assumption. A second assumption states that the conditional distribution over the next state, given the current state, doesn’t change over time. This is termed the stationary process assumption.
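The two assumptions can be written compactly. The notation here is an assumption introduced for illustration and does not come from the original post: q_t denotes the hidden state at time t, and a_{ij} the probability of moving from state i to state j.

```latex
% Limited horizon: the next state depends only on the current state.
P(q_t \mid q_{t-1}, q_{t-2}, \dots, q_1) = P(q_t \mid q_{t-1})

% Stationary process: the transition probabilities do not depend on t.
P(q_{t+1} = j \mid q_t = i) = a_{ij} \quad \text{for all } t
```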
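A Markov chain is described by its states and the transition probabilities between them. A hidden Markov model, however, consists of five important components: a set of hidden states, a sequence of observations, an initial probability distribution over the hidden states, the transition probabilities between hidden states, and the emission probabilities of each observation given a hidden state.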
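The components of a hidden Markov model can be written down as plain data and used to generate a sequence. The sketch below uses the rainy/sunny weather and walk/shop/clean activities example from this post, but every number is an assumption: the diagram is not reproduced here, so the probabilities are the values commonly used with this textbook example. The names `sample_hmm` and `pick` are hypothetical helpers.

```python
import random

# Hidden states and the observations they can emit.
HIDDEN_STATES = ["rainy", "sunny"]
OBSERVATIONS = ["walk", "shop", "clean"]

# Assumed probabilities (illustrative values, not read from the diagram).
INITIAL = {"rainy": 0.6, "sunny": 0.4}
TRANSITION = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
EMISSION = {
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

def pick(rng, distribution):
    """Draw one key from a {outcome: probability} dictionary."""
    outcomes = list(distribution)
    return rng.choices(outcomes, weights=[distribution[o] for o in outcomes])[0]

def sample_hmm(n_days, seed=0):
    """Generate hidden weather states and the activities they emit."""
    rng = random.Random(seed)
    hidden, observed = [], []
    state = pick(rng, INITIAL)
    for _ in range(n_days):
        hidden.append(state)
        observed.append(pick(rng, EMISSION[state]))  # emission depends on the state
        state = pick(rng, TRANSITION[state])         # next state depends on the state
    return hidden, observed

hidden, observed = sample_hmm(5)
print("hidden:  ", hidden)
print("observed:", observed)
```

In practice only the `observed` list would be available, and the hidden list would have to be inferred from it.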
Let’s understand the above using the hidden Markov model representation shown below:
The hidden Markov model in the above diagram represents the process of predicting whether someone will be found walking, shopping, or cleaning on a particular day depending on whether the day is rainy or sunny. Here the weather is the hidden state and the activity is the observation. The following represents the five components of the hidden Markov model in the above diagram:
Note the following in the above picture:
The hidden Markov model is a special type of Bayesian network whose hidden variables are discrete random variables. In a first-order hidden Markov model, each hidden state depends only on the immediately preceding hidden state. In a second-order hidden Markov model, each hidden state depends on the two preceding hidden states.
The hidden Markov model involves two kinds of states: hidden states and observable states. A hidden state is one that cannot be directly observed, whereas an observable state can be. One hidden state can be associated with many observable states, and one observable state may be associated with more than one hidden state. The model uses probabilities to describe both kinds of moves: transition probabilities govern the move from one hidden state to another, and emission probabilities govern which observable state a hidden state produces.
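Transition and emission probabilities are enough to answer two standard questions about a sequence of observations: how likely is the sequence (the forward algorithm), and which hidden states most plausibly produced it (the Viterbi algorithm). The sketch below is a minimal illustration for short sequences, without the log-space scaling a practical implementation would need. The probabilities are assumed, illustrative values for a rainy/sunny model with walk/shop/clean observations, and the function names are hypothetical.

```python
STATES = ["rainy", "sunny"]

# Assumed probabilities (illustrative values only).
INITIAL = {"rainy": 0.6, "sunny": 0.4}
TRANSITION = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
EMISSION = {
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

def forward_likelihood(observations):
    """Probability of the observations, summed over all hidden state paths."""
    alpha = {s: INITIAL[s] * EMISSION[s][observations[0]] for s in STATES}
    for obs in observations[1:]:
        alpha = {
            s: EMISSION[s][obs] * sum(alpha[p] * TRANSITION[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())

def viterbi(observations):
    """Return (probability, path) for the single most likely hidden path."""
    best = {s: (INITIAL[s] * EMISSION[s][observations[0]], [s]) for s in STATES}
    for obs in observations[1:]:
        new_best = {}
        for s in STATES:
            # Best way to arrive in state s from any previous state.
            prob, path = max(
                ((best[p][0] * TRANSITION[p][s], best[p][1]) for p in STATES),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * EMISSION[s][obs], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])

activities = ["walk", "shop", "clean"]
print("likelihood:", round(forward_likelihood(activities), 6))
probability, weather = viterbi(activities)
print("most likely weather:", weather, "with probability", round(probability, 5))
```

With these assumed values, the most likely explanation for walking, then shopping, then cleaning is a sunny day followed by two rainy days.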
Here are a few real-world examples where hidden Markov models are used:
Here are some great tutorials I could find on YouTube. Please feel free to suggest any other tutorials you have come across.
In conclusion, this blog has explored what a Markov model is, what hidden Markov models are, and some of their real-world applications. It is important to understand these topics before using them in a data science project. For sequential data in particular, these models offer a principled way to reason about states that cannot be observed directly.
Loved this Ajitesh! Really interesting and concise stuff. Thanks for uploading this!
Thank you for making it so easy to understand!
Dear sir, I'm a PhD students in Nigeria. I'm working on face recognition and extraction. I s there a way I can use HMM in my research. I'm new to HMM but read several articles before coming across your article recently. I need help. Thank you sir
My God!
What would be ur hidden variables n observed variables in this instance ?
Use deep learning..convolutional neural netwks for face recogn ition...it's easier n much more straight forward.
Read prof bharatendra Rai tutorials in YouTube on the same.
clear, to the point explanation with examples.
Thanks very much.
You makes my day bright
I need help but I don't know whom to contact. I would like to use HMM for dynamic gesture detection. I am trying to use it in a way that I would capture the movement of object and detect it using HMM .