In this post, you will learn about some of the key challenges in implementing AI / ML projects successfully, consistently, and sustainably. As an AI / ML project stakeholder, whether a senior manager, data science architect, or product manager, you need a good understanding of what it takes to execute AI / ML projects successfully and create value for customers and the business. Whether you are building AI / ML products or building custom models for your clients in a SaaS setup, you will come across most of these challenges.
Here are some of the key challenges:
- Whether a machine learning solution is required?
- Business value metrics definition
- Data sourcing challenges
- Data management related challenges
- Limited data availability for building models
- Data security challenges
- Computing intensive feature engineering / processing
- Scalable platform for serving predictions
- Longer lead time for model deployments in production
- Support & education from different teams
Whether a Machine Learning Solution is Required & Can Be Implemented?
The trickiest part of AI / ML projects is identifying whether a business problem can be solved using machine learning. Many times, the problem can instead be solved by implementing a complex set of rules. To determine whether a problem calls for a machine learning solution and can actually be solved with one, do the following:
- Define the machine learning tasks
- Understand what kind of data is required and whether data is available
- Define a performance metric which can be used to evaluate models
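As a rough illustration of the checklist above, the sketch below defines a hypothetical binary classification task (flagging invoices likely to be paid late), a simple rule-based baseline, and a performance metric to evaluate it with. The task, the records, and the 10-day threshold are all made-up assumptions, not taken from a real project. If a rule like this already scores well on the chosen metric, a machine learning model may not be needed.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Scores:
    precision: float
    recall: float


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Scores:
    """Compute precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Scores(precision, recall)


# Hypothetical data: (days the previous invoice was overdue, paid-late label).
records = [(0, 0), (12, 1), (3, 0), (40, 1), (7, 1), (15, 0)]


def rule_baseline(days_overdue: int) -> int:
    """Hand-written rule (assumed threshold): flag if overdue by more than 10 days."""
    return 1 if days_overdue > 10 else 0


y_true = [label for _, label in records]
y_rule = [rule_baseline(days) for days, _ in records]
baseline = precision_recall(y_true, y_rule)
print(f"rule baseline: precision={baseline.precision:.2f}, recall={baseline.recall:.2f}")
```

Any candidate model can then be scored with the same `precision_recall` function and compared against the baseline.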
Business Value Metrics Definition
Apart from model performance metrics, it is extremely important to identify the value metrics that will be used to evaluate the machine learning based solution. Value metrics are business metrics. Often, models get deployed in production but, in the absence of value metrics, their usefulness cannot be established, and the project fails as a result.
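One possible way to connect model performance to a value metric is to attach a business value or cost to each prediction outcome. The sketch below is only an illustration: the function name `net_value` and all of the counts and monetary amounts are hypothetical assumptions, and a real project would agree on these figures with business stakeholders.

```python
def net_value(tp: int, fp: int, fn: int,
              value_per_tp: float, cost_per_fp: float, cost_per_fn: float) -> float:
    """Translate prediction outcomes into a single business value figure.

    tp: correct positive predictions (value gained for each)
    fp: false alarms (cost of the unnecessary action for each)
    fn: missed cases (cost of the missed opportunity for each)
    """
    return tp * value_per_tp - fp * cost_per_fp - fn * cost_per_fn


# Hypothetical monthly outcome counts and per-outcome amounts.
monthly_value = net_value(tp=80, fp=30, fn=20,
                          value_per_tp=50.0, cost_per_fp=5.0, cost_per_fn=40.0)
print(f"estimated net value per month: {monthly_value:.2f}")
```

Tracking a figure like this after deployment gives stakeholders a way to judge whether the model is worth keeping in production.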
Data Sourcing Challenges
It is very important to determine the data sources (internal and external) which will be used to train the models. In a SaaS setup, this data comes from the customer database. Because customers can store data in different formats, gathering the right kind of data for training the models becomes a challenge.
Data Management Related Challenges
Data preparation, processing, and access by teams working across different offices and geographic locations need to be managed and monitored constantly. From a security standpoint, the database (DB) team, the data science (DS) team, and the data security team need to collaborate at regular intervals to make sure data scientists have secure access to the right data sets. The DB and DS teams also need to collaborate at regular intervals on data gathering and preparation.
Limited Data Availability for Building Models
Given that a machine learning problem can cut across product lines with different databases, DS team members working on a problem for a particular product only get access to the database for that product. This limits their ability to come up with a strong set of features spanning different product databases. For example, the team working on a product A problem does not get to study the data from product B or product C, which could provide useful insights for analysing the problem and the solution approaches. Similarly, team members working on product C problems do not get to see related business-domain data in other databases such as product A or product B. Because business functions are often interconnected, not having data from the different product databases will most likely result in sub-optimal models.
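To make the idea of cross-product features concrete, here is a minimal sketch that combines records from two hypothetical product databases into one feature set per customer. The table contents, the column names, and the shared `customer_id` key are all assumptions made for illustration. In practice the rows would come from separate databases, and a shared key may not exist.

```python
from collections import defaultdict

# Hypothetical rows exported from the product A database (orders).
product_a_orders = [
    {"customer_id": "c1", "order_total": 120.0},
    {"customer_id": "c1", "order_total": 80.0},
    {"customer_id": "c2", "order_total": 40.0},
]

# Hypothetical rows exported from the product B database (support tickets).
product_b_tickets = [
    {"customer_id": "c1", "severity": "high"},
    {"customer_id": "c3", "severity": "low"},
]


def build_features(orders, tickets):
    """Aggregate per-customer features across both product data sets."""
    features = defaultdict(lambda: {"total_spend": 0.0, "ticket_count": 0})
    for row in orders:
        features[row["customer_id"]]["total_spend"] += row["order_total"]
    for row in tickets:
        features[row["customer_id"]]["ticket_count"] += 1
    return dict(features)


features = build_features(product_a_orders, product_b_tickets)
for customer_id, values in sorted(features.items()):
    print(customer_id, values)
```

A team with access to only product A would see `total_spend` but never `ticket_count`, which is the kind of missing signal described above.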
Data Security Challenges
Data preparation and exploratory data analysis are key requirements for building ML models. Data scientists working from different geographic locations need easy and quick access to data while meeting the data security requirements. It may happen that the data scientists in one of the locations are juniors or interns who cannot be given full data access for security reasons. Instead, they are provided with selected data sets consisting of selected columns from selected tables. This creates a challenge: the data security, database, and data science teams need to work together on data preparation, data masking, data security assessment, and making the data available in the desired format, such as CSV. It also requires regular involvement from these teams because data security requirements must be assessed constantly. From a business standpoint, this not only reduces data science team productivity but also delays the deployment of models in production.
Computing Intensive Feature Engineering / Processing
Most of the time, data scientists use laptops for building models, and laptops have limited computing resources relative to big data processing requirements. As a result, data scientists across different locations are constrained to work only on a selected set of problems, with a limited set of features, where the data volume is small enough to handle. If you have large clients with big data, this will sooner or later hurt the business's ability to deliver high quality AI-powered solutions in a timely manner.
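The column selection, masking, and CSV export steps described above can be partly automated. The following is a minimal sketch using only the Python standard library. The column names, the allow-list, and the salt are hypothetical, and hashing an identifier with a salt is pseudonymization rather than full anonymization, so a real setup would still need review by the data security team.

```python
import csv
import hashlib
import io

# Hypothetical policy: which columns may be shared, and which must be masked.
ALLOWED_COLUMNS = ["customer_id", "plan", "monthly_usage"]
MASKED_COLUMNS = {"customer_id"}
SALT = "replace-with-a-secret-salt"  # assumption: kept outside source control


def mask(value: str) -> str:
    """Replace an identifier with a salted, truncated SHA-256 digest."""
    return hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()[:12]


def export_restricted(rows, out_file) -> None:
    """Write only the allowed columns to CSV, masking sensitive ones."""
    writer = csv.DictWriter(out_file, fieldnames=ALLOWED_COLUMNS)
    writer.writeheader()
    for row in rows:
        restricted = {col: row[col] for col in ALLOWED_COLUMNS}
        for col in MASKED_COLUMNS:
            restricted[col] = mask(str(restricted[col]))
        writer.writerow(restricted)


# Hypothetical source rows; "email" is not on the allow-list and is dropped.
rows = [
    {"customer_id": "c1", "email": "a@example.com", "plan": "pro", "monthly_usage": 42},
    {"customer_id": "c2", "email": "b@example.com", "plan": "free", "monthly_usage": 7},
]

buffer = io.StringIO()
export_restricted(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Because the same input always maps to the same masked value, data scientists can still join or group by the masked `customer_id` without seeing the original identifier.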
Scalable Platform for Serving Predictions
Often, the ML platform is unable to serve predictions within the desired time because compute-intensive feature calculations have to be done at runtime. What is needed is distributed feature processing for faster predictions. This is where the ML platform needs to scale so that it can handle big data requirements, including running Spark workloads for feature / data processing and serving predictions.
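The basic pattern behind distributed feature processing is to split the input into partitions, compute features for each partition independently, and combine the results. The sketch below shows only that structure, using a local thread pool from the Python standard library as a stand-in for a cluster. It does not use Spark, the record layout and function names are hypothetical, and threads in CPython will not speed up CPU-bound work, so treat it as an illustration of the shape of the computation rather than a performance solution.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_features(partition):
    """Compute a per-record feature (average amount) for one partition."""
    return [
        {"id": rec["id"], "avg_amount": sum(rec["amounts"]) / len(rec["amounts"])}
        for rec in partition
    ]


def partition_rows(rows, n):
    """Split rows into n roughly equal partitions (round-robin)."""
    return [rows[i::n] for i in range(n)]


def parallel_features(rows, workers=2):
    """Map compute_features over partitions and merge the results by id."""
    parts = partition_rows(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(compute_features, parts))
    merged = [feature for part in results for feature in part]
    return sorted(merged, key=lambda feature: feature["id"])


# Hypothetical input records.
rows = [
    {"id": 1, "amounts": [10, 20]},
    {"id": 2, "amounts": [5]},
    {"id": 3, "amounts": [1, 2, 3]},
]
features = parallel_features(rows)
print(features)
```

On a platform that runs Spark, the partitioning and the per-partition work would be handled by the cluster instead of a local pool.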
Longer Lead Time for Model Deployments
There is always a need to process large volumes of data. The fact that the DS team generally uses laptops for data processing and model building becomes a big constraint. Data has to be broken into chunks and processed on one's laptop, which causes delays. Additionally, training models on large volumes of data is constrained by the lack of big data infrastructure. All of this delays building models and moving them into production.
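For readers unfamiliar with the chunking workaround mentioned above, here is a minimal sketch of processing a CSV file a few rows at a time so that the whole file never has to fit in memory. The column name, the chunk size, and the in-memory sample file are assumptions for illustration. A real file would be opened from disk, and the chunk size would be much larger.

```python
import csv
import io
from itertools import islice


def iter_chunks(reader, chunk_size):
    """Yield lists of at most chunk_size rows from an iterator of rows."""
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def mean_in_chunks(csv_file, column, chunk_size=2):
    """Compute the mean of a numeric column while holding one chunk at a time."""
    reader = csv.DictReader(csv_file)
    total = 0.0
    count = 0
    for chunk in iter_chunks(reader, chunk_size):
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    return total / count if count else 0.0


# Small in-memory stand-in for a large file on disk.
sample = io.StringIO("amount\n10\n20\n30\n40\n50\n")
average = mean_in_chunks(sample, "amount")
print(average)
```

This works for statistics that can be accumulated incrementally, but it is slow, and many model training steps cannot be split this way, which is part of why this workaround adds lead time.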
AI/ML Support & Education (Different Teams)
Finally, the AI / ML team cannot work in a silo and still create value for the business. The predictions need to be consumed by the products, so the engineering team has an important role to play in the integration. The models need to be deployed in production by the IT team. The customer service team needs to interact with clients about the predictions. Consultants may need to understand how the model predictions work in order to customize and deploy the solution.
To meet the above requirements seamlessly, it is very important to provide high level education to the different stakeholders, including the following:
- Software engineering team including product managers
- IT team deploying AI / ML products
- Customer service team
- Consultants designing AI / ML solutions
- Sales & marketing team selling AI products / solutions