Generative vs Discriminative Models: Examples

The field of machine learning is rapidly evolving, and with it, the concepts and techniques that are used to develop models that can learn from data. Among these concepts, generative and discriminative models are two widely used approaches in the field. Generative models learn the joint probability distribution of the input features and output labels, whereas discriminative models learn the conditional probability distribution of the output labels given the input features. While both models have their strengths and weaknesses, understanding the differences between them is crucial to developing effective machine learning systems.

Real-world problems such as speech recognition, natural language processing, and computer vision require solutions that can handle vast amounts of data and predict outcomes accurately. Both generative and discriminative models can be applied to these problems, and each approach has its own advantages and disadvantages.

In this blog, we will delve deeper into generative and discriminative models, explain the differences between them, and provide examples of real-world problems that can be solved using each approach. So, whether you are new to machine learning or an experienced practitioner, this blog will help you understand the importance of these models, and how they can be used to tackle complex problems.

What are Generative Models?

Generative models are a type of machine learning model used to generate new data samples based on a training set. For example, a generative model could be trained on a dataset of pictures of cats and then used to generate new cat pictures. Similarly, a generative model trained on images of faces could be used to generate new faces that look realistic but are not identical to any of the training images. The following picture represents how generative modeling works. The training data set consists of different images of horses. The model is trained to capture the complex relationships between the pixels in the horse images. Having learned from the samples in the training data (each one is “an observation” in the image below), the model is then used to create new images of horses, shown under “Generated samples”.

A generative model must be able to generate different variations of the desired output. For that reason, generative models must be probabilistic rather than deterministic, since a deterministic model would produce the same output every time. For example, taking the average value of each pixel across the training dataset won’t work. A generative model needs a random component that affects each output differently.
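The difference between averaging and sampling can be illustrated with a minimal sketch. It assumes a made-up set of four-pixel "images" and fits an independent Gaussian to each pixel position. This is a deliberately simple stand-in for a real generative model, which would also capture the relationships between pixels. The names `training_images`, `average_image`, and `sample_image` are hypothetical.

```python
import random
import statistics

# Hypothetical toy "images": each one is a list of 4 pixel intensities.
training_images = [
    [0.9, 0.1, 0.8, 0.2],
    [0.8, 0.2, 0.9, 0.1],
    [1.0, 0.0, 0.7, 0.3],
    [0.7, 0.3, 1.0, 0.0],
]

# Transpose so that each entry holds all training values for one pixel position.
pixels = list(zip(*training_images))
pixel_means = [statistics.mean(p) for p in pixels]
pixel_stds = [statistics.stdev(p) for p in pixels]


def average_image():
    """Deterministic 'model': always returns the same per-pixel mean."""
    return list(pixel_means)


def sample_image(rng):
    """Probabilistic model: draws each pixel from its fitted Gaussian."""
    return [rng.gauss(m, s) for m, s in zip(pixel_means, pixel_stds)]


rng = random.Random(0)  # fixed seed so the run is repeatable
deterministic_outputs = [average_image() for _ in range(3)]
sampled_outputs = [sample_image(rng) for _ in range(3)]

print("average:", deterministic_outputs)  # three identical images
print("sampled:", sampled_outputs)        # three different images
```

The averaged output is identical on every call, while each sampled output differs because of the random draw.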

Two widely used types of deep generative models are Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). Other generative models include Naive Bayes, Gaussian mixture models, and hidden Markov models. A GAN consists of two neural networks, a generator and a discriminator, that compete with each other: the generator tries to produce realistic data, and the discriminator tries to tell generated data apart from real data. A VAE consists of an encoder and a decoder, which work together to compress data into latent variables and then generate new data from these latent variables.
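For readers who prefer equations, the training objectives commonly given for these two model families can be written as below. The notation is an assumption introduced here and does not come from this post: G is the generator, D the discriminator, p_data the data distribution, p_z the noise distribution, q_φ the encoder, and p_θ the decoder.

```latex
% GAN: the generator G and the discriminator D play a minimax game
\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% VAE: the encoder q_\phi(z | x) and decoder p_\theta(x | z) maximize
% the evidence lower bound (ELBO) on the log-likelihood
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

In the first expression the "competition" is the opposing min and max. In the second, the expectation term rewards accurate reconstruction and the KL term keeps the latent variables close to the prior p(z).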

There are several benefits to using generative models. The following are some of them:

  • One is that they can help us understand complex data sets. Because a generative model learns how the data is distributed, the samples it produces can reveal structure in the data. For example, a generative model trained on observations of a particular species of bird can generate plausible examples of how that bird behaves. This can help us learn more about the species and how it interacts with its environment.
  • Another benefit of generative models is that they can help us create new data. For example, if we need a data set for testing purposes, we can use a generative model to generate it. This can save time and effort, and it can help our tests cover a wider range of cases.
  • Finally, generative models can be used to improve other machine learning algorithms. By adding generated samples to the training data, we can expose an algorithm to more varied examples, which can lead to better performance.

Generative models are often used for data augmentation, as they can help to improve the performance of machine learning models by providing more training data. Generative models can also be used for unsupervised learning, as they can learn the underlying distribution of the data. They are typically used for tasks such as image synthesis, voice synthesis, and natural language processing. 

There are a few sources of information on generative models that you could explore. The first is the Wikipedia page on the topic, generative models, which provides a good high-level overview of the concepts involved. If you want to go deeper, there are a number of textbooks on the subject. Finally, there are a number of online courses that can teach you about generative models in more detail.

What are Discriminative Models?

Discriminative models are a type of machine learning model used to predict a target variable, such as a label or class, from a set of input features. For example, a discriminative model could be used to predict whether or not an email is spam. These models learn the relationship between the input features and the target variable, and then use that relationship to make predictions. They are often used for classification tasks, where the goal is to predict which class an instance belongs to. In general, discriminative models are better suited for classification tasks, while generative models are better suited for density estimation and unsupervised learning tasks. The picture below represents how a discriminative model works in the case of classification. Each painting in the training data is labelled 1 if it was painted by Van Gogh and 0 otherwise. Later, a new painting is passed to the trained model, which predicts the probability that it was painted by Van Gogh.

Discriminative models work by learning a decision boundary that best separates the training data points into their respective classes. In other words, they learn how to map the input data to the correct label. Once the decision boundary has been learned, the model can be used to predict the class label for new data points.
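Learning a decision boundary can be sketched with a small logistic regression trained by gradient descent. This is a minimal illustration under stated assumptions, not a production implementation: the one-feature data set, the learning rate, the number of epochs, and the function names are all made up for this example.

```python
import math

# Hypothetical toy data: one feature per point; larger values belong to class 1.
X = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(X, y, lr=0.1, epochs=5000):
    """Fit P(Y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(X)
    for _ in range(epochs):
        # Gradient of the average log-loss with respect to w and b.
        errors = [sigmoid(w * xi + b) - yi for xi, yi in zip(X, y)]
        grad_w = sum(e * xi for e, xi in zip(errors, X)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = fit_logistic_regression(X, y)


def predict_proba(x):
    """Estimated probability that x belongs to class 1."""
    return sigmoid(w * x + b)


# The decision boundary is the point where the predicted probability is 0.5.
decision_boundary = -b / w

print("decision boundary:", decision_boundary)
print("P(Y=1 | x=1.0):", predict_proba(1.0))
print("P(Y=1 | x=4.0):", predict_proba(4.0))
```

The model never learns what the inputs themselves look like. It only learns where to place the boundary between the two classes, which here falls between the largest class 0 point (2.0) and the smallest class 1 point (3.0).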

There are different types of discriminative models, including logistic regression, support vector machines, and decision trees. Each type has its own strengths and weaknesses. Logistic regression is typically used for binary classification and produces a linear decision boundary, while support vector machines can use kernels to learn non-linear boundaries. Decision trees can be used for both classification and regression tasks. It is therefore important to understand the different types of discriminative models in order to choose the right one for the task at hand.

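The following are some examples of discriminative models:

  • Logistic regression
  • Support vector machines (SVM)
  • Decision trees
  • Random forest classifiers

Linear discriminant analysis is sometimes included in such lists because of its name, but it models the distribution of the features within each class and is therefore usually classified as a generative model.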

Difference between Generative & Discriminative Models

Discriminative models and generative models are two different types of machine learning models.

  • Discriminative models are used to predict the probability of a certain class label, given an input. Generative models, on the other hand, are used to generate new data samples that are similar to the training data. In other words, discriminative models focus on predicting labels, while generative models focus on modeling the distribution of data.
  • Mathematically, a discriminative model estimates P(Y|X). In other words, discriminative modeling aims to model the probability of a label Y given some observation X. A generative model, on the other hand, estimates P(X), the probability of observing an observation X, or the joint probability P(X, Y) when the data is labeled. Sampling from this distribution allows us to generate new observations.
  • Discriminative models only need to learn the relationship between inputs and outputs, while generative models also need to learn the distribution of the data itself. As a result, discriminative models often achieve higher classification accuracy when enough labeled data is available. However, generative models have the advantage of being able to generate new data samples, which can be useful for tasks such as data augmentation.
  • Simple generative classifiers such as Naive Bayes are often easy to train and can work well with little training data, but they can be less accurate than discriminative models when their assumptions about the data do not hold. The choice of model depends on the application and the type of data.
  • Discriminative models directly model the dependence of the label on the input features. In contrast, generative models first model the joint distribution of the input features and the label, and then use this joint distribution to infer the label for new data points.
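The last point, that a generative model can infer labels from the joint distribution, relies on Bayes' rule: P(Y|X) = P(X|Y) P(Y) / P(X). The sketch below illustrates this with a one-feature Gaussian classifier. The toy data, the Gaussian assumption for P(X|Y), and all function names are assumptions made for this example.

```python
import math
import random
import statistics

# Hypothetical toy data: feature values grouped by class label.
data = {
    0: [0.5, 1.0, 1.5, 2.0],
    1: [3.0, 3.5, 4.0, 4.5],
}
n_total = sum(len(xs) for xs in data.values())

# Generative step: estimate P(Y) and a Gaussian P(X | Y) for each class.
priors = {label: len(xs) / n_total for label, xs in data.items()}
params = {
    label: (statistics.mean(xs), statistics.stdev(xs))
    for label, xs in data.items()
}


def gaussian_pdf(x, mean, std):
    coeff = 1.0 / (std * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mean) / std) ** 2)


def joint(x, label):
    """P(X = x, Y = label) = P(X | Y) * P(Y)."""
    mean, std = params[label]
    return gaussian_pdf(x, mean, std) * priors[label]


def posterior(x, label):
    """P(Y = label | X = x), obtained from the joint via Bayes' rule."""
    evidence = sum(joint(x, k) for k in data)  # P(X = x)
    return joint(x, label) / evidence


def sample(rng):
    """Draw a new (x, label) pair from the learned joint distribution."""
    label = rng.choices(list(priors), weights=list(priors.values()))[0]
    mean, std = params[label]
    return rng.gauss(mean, std), label


rng = random.Random(0)
print("P(Y=0 | x=1.0):", posterior(1.0, 0))
print("P(Y=1 | x=4.0):", posterior(4.0, 1))
print("new sample:", sample(rng))
```

The same fitted model serves two purposes: `posterior` classifies a new point, and `sample` generates new data. A discriminative model that estimates only P(Y|X) cannot generate new data in this way.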

Conclusion

In conclusion, it’s important to understand the difference between generative and discriminative models when working in machine learning. Generative models are used to generate data, while discriminative models are used to discriminate between different classes. Both types of models have their own uses and applications. Thanks for reading!

Ajitesh Kumar

I have been recently working in the area of Data analytics including Data Science and Machine Learning / Deep Learning. I am also passionate about different technologies including programming languages such as Java/JEE, Javascript, Python, R, Julia, etc, and technologies such as Blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, big data, etc. I would love to connect with you on Linkedin. Check out my latest book titled as First Principles Thinking: Building winning products using first principles thinking.
