Data-centric vs Model-centric AI: Concepts, Examples


There is a lot of discussion about which approach to AI is better: model-centric or data-centric. In this blog post, we will explore both approaches, give examples of each, and discuss their benefits and drawbacks. By the end of this post, you will have a better understanding of both approaches and be able to decide which one is right for your business. If you are a product manager or data science architect, knowing both approaches will help you make informed decisions about the products and services you build.

Model-centric approach to AI

The model-centric approach to AI focuses on choosing the right machine learning algorithms, programming language, and AI platform to build high-quality machine learning models. This approach has driven major advances in machine learning and deep learning algorithms. The focus on building high-performing models has also produced many AI, machine learning, and deep learning frameworks in programming languages such as Python and R. Popular Python frameworks include scikit-learn (sklearn), TensorFlow, and PyTorch. In addition, almost all cloud service providers have launched AI / ML services for building machine learning models, and many professionals have taken up data science and machine learning as a career.

The picture below contrasts model-centric and data-centric AI. In model-centric AI, the focus is on getting the code (model) right, while in data-centric AI, the focus shifts to the data.

Figure: Data-centric vs model-centric AI

The following are some of the focus areas of model-centric AI:

  • Which ML algorithms give the best generalization accuracy?
  • Should one model or multiple models be used?
  • Which ML tools and frameworks should be used for building the models and monitoring their performance?
  • Which programming language should be used for building the models?
  • Which MLOps framework should be used for deploying the models into production?
  • What expertise and experience are needed for building and deploying the ML models into production?
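
To make the first question concrete, here is a minimal sketch of the model-centric workflow: the data stays fixed and candidate models are compared by k-fold cross-validation. It uses only the Python standard library. The toy data set, the two candidate models (a majority-class baseline and a 1-nearest-neighbour classifier), and all function names are assumptions chosen for illustration; in practice you would more likely rely on a framework such as scikit-learn.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Features = Tuple[float, ...]
Sample = Tuple[Features, str]                   # (features, label)
Model = Callable[[List[Sample], Features], str]  # (training data, x) -> predicted label


def majority_class(train: List[Sample], x: Features) -> str:
    """Baseline: always predict the most common training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def nearest_neighbour(train: List[Sample], x: Features) -> str:
    """1-NN: predict the label of the closest training point (squared Euclidean distance)."""
    def dist(sample: Sample) -> float:
        return sum((a - b) ** 2 for a, b in zip(sample[0], x))
    return min(train, key=dist)[1]


def cross_val_accuracy(model: Model, data: List[Sample], k: int = 4) -> float:
    """Mean accuracy over k folds; fold i holds out every k-th sample starting at index i."""
    scores = []
    for i in range(k):
        test = data[i::k]
        train = [s for j, s in enumerate(data) if j % k != i]
        correct = sum(model(train, x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / k


# Hypothetical toy data set: two well-separated clusters labelled "a" and "b".
DATA: List[Sample] = [
    ((1.0, 1.2), "a"), ((5.0, 5.1), "b"),
    ((0.8, 1.0), "a"), ((5.2, 4.9), "b"),
    ((1.1, 0.9), "a"), ((4.8, 5.3), "b"),
    ((0.9, 1.1), "a"), ((5.1, 5.0), "b"),
]

# The data is held fixed; only the model changes between runs.
results: Dict[str, float] = {
    "majority_class": cross_val_accuracy(majority_class, DATA),
    "nearest_neighbour": cross_val_accuracy(nearest_neighbour, DATA),
}
best = max(results, key=results.get)
print(results, "best:", best)
```

On this toy data the nearest-neighbour model scores higher than the baseline, so it is the one selected. The point of the sketch is the workflow, not the specific models: every iteration changes the model while the data set is treated as given.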

Some of the benefits of this approach are:

  • This approach can be used to solve most AI problems.
  • The focus on building better models has led to advances in AI technology.
  • This approach is well suited for businesses that have data and want to use AI to solve business problems.

Data-centric approach to AI

The data-centric approach to AI focuses on getting the right kind of data to build high-quality, high-performance machine learning models. Unlike model-centric AI, the effort goes into improving the quality of the training data rather than the model itself. The following are some key aspects of a high-quality data set that can result in high-quality machine learning / AI models:

  • Make the data labelling consistent
  • Remove ambiguous data samples
  • Identify and remove noisy samples and outliers
  • Ensure that the data set is free of missing values
  • Ensure that the data set is representative of real-world data, with enough instances to cover the scenarios the model is expected to encounter
  • Check that the data maps to business levers
  • Focus on what data is needed rather than on how much data is needed
  • Check for data bias and data leakage
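
Several of the checks above can be automated. The following is a minimal sketch, using only the Python standard library, of a small data audit that looks for conflicting labels, missing values, outliers, and rows shared between training and test splits (one simple form of data leakage). The records, column names, split boundaries, and the 1.5 × IQR outlier rule are all assumptions chosen for illustration, not a prescribed method.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical records: (age, income, label); None marks a missing value.
ROWS = [
    (25, 40_000, "approve"),
    (25, 40_000, "reject"),    # same features as the row above, different label
    (31, None, "approve"),     # missing income
    (40, 52_000, "approve"),
    (38, 48_000, "reject"),
    (45, 51_000, "approve"),
    (29, 45_000, "reject"),
    (33, 900_000, "approve"),  # income far outside the rest
]


def conflicting_labels(rows):
    """Return feature tuples that appear with more than one label."""
    seen = defaultdict(set)
    for *features, label in rows:
        seen[tuple(features)].add(label)
    return {f: labels for f, labels in seen.items() if len(labels) > 1}


def missing_counts(rows, columns):
    """Count None values per column."""
    return {name: sum(r[i] is None for r in rows) for i, name in enumerate(columns)}


def iqr_outliers(values, factor=1.5):
    """Flag values outside [Q1 - factor * IQR, Q3 + factor * IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - factor * spread, q3 + factor * spread
    return [v for v in values if v < low or v > high]


def leaked_rows(train, test):
    """Return test rows that also appear verbatim in the training split."""
    train_set = set(train)
    return [r for r in test if r in train_set]


incomes = [r[1] for r in ROWS if r[1] is not None]
report = {
    "conflicts": conflicting_labels(ROWS),
    "missing": missing_counts(ROWS, ("age", "income", "label")),
    "outliers": iqr_outliers(incomes),
    # Deliberately overlapping splits, to show what the leakage check catches.
    "leaked": leaked_rows(ROWS[:6], ROWS[5:]),
}
for check, finding in report.items():
    print(check, finding)
```

A report like this only flags candidates. Deciding whether a flagged row is a labelling error, a genuine rare case, or a data-entry mistake still needs someone who knows the domain.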

Some of the benefits of data-centric AI are:

  • Better model performance on new data, owing to a high-quality data set that is representative of real-world data.
  • Data-centric AI can be less resource-intensive, as it requires less data to achieve high performance.
  • This approach is well suited for businesses that already have data and want to use AI to solve business problems.

Some of the drawbacks of the data-centric approach are:

  • The quality of data can be hard to control and manage.
  • Data sets can be biased if they do not represent the actual population.
  • This approach can be expensive, as collecting, cleaning, and labelling high-quality data takes significant time and effort.

Model-centric and data-centric AI are two different approaches to AI. The model-centric approach focuses on choosing the right machine learning algorithms, programming language, and AI platform to build high-quality models, and it has driven major advances in machine learning and deep learning algorithms. The data-centric approach focuses on getting the right kind of data, so the effort shifts from improving the model to improving the quality of the training data. While there are many ways to approach AI, we believe that a hybrid or balanced approach that combines model-centric and data-centric practices is the most effective way to create intelligent machines. We have outlined the benefits of both of these approaches and how they can be used in business. If you have any questions about our services or would like more information, please let us know. We would be happy to discuss our approach further and answer any questions you may have.

Ajitesh Kumar


I have recently been working in the area of data analytics, including data science and machine learning / deep learning. I am also passionate about different technologies, including programming languages such as Java/JEE, JavaScript, Python, R, and Julia, and technologies such as blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, and big data. I would love to connect with you on LinkedIn. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking.
Posted in AI, Data, Data analytics, Data Science, Machine Learning.