Top 10 data science skills for product managers
In this post, you will learn about some of the top data science skills / concepts that product managers / business analysts may need in order to create useful machine learning based solutions. These are the topics that come up in day-to-day work with data science / machine learning teams, and understanding them well will help product managers / business analysts frame and solve machine learning based problems.
Product managers / business analysts must understand the following terms clearly and without ambiguity.
It is of utmost importance for product managers to identify which problems are machine learning problems. Here are some rules of thumb to determine whether a problem is a machine learning problem or not:
Not necessarily. AI problems that apply machine learning techniques can be termed machine learning problems. There can also be problems that are solved using a large, complex set of rules, which need computing rather than human effort but do not involve machine learning.
“Model” is the term used frequently for the computing entity that serves predictions. It must be clearly understood that models are nothing but mathematical models. Machine learning is used to learn parametric or non-parametric models. Parametric models have a fixed set of coefficients / parameters that are determined from the data, for example linear regression and logistic regression. Non-parametric models do not assume a fixed set of parameters; their structure grows with the training data, for example decision trees and random forests. The terms model and algorithm are often used interchangeably.
Training or fitting a model means using past / historical data to learn a machine learning model with a machine learning algorithm. For example, training a linear regression model means using data to determine the coefficients of the linear regression equation.
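To make the idea of training concrete, here is a minimal sketch in plain Python that fits a one-feature linear regression using ordinary least squares. The data (marketing spend vs. sign-ups) and the function names are made up for illustration; a data science team would normally use a library rather than hand-written code.

```python
# Minimal sketch: "training" a one-feature linear regression means using
# historical data to estimate its coefficients (slope and intercept).

def fit_linear_regression(xs, ys):
    """Return (slope, intercept) that minimise squared error on the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least squares for a single feature.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data (assumption, not from a real product).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
signups = [12.0, 14.0, 16.0, 18.0, 20.0]

# Training: the coefficients are learned from the historical data.
slope, intercept = fit_linear_regression(spend, signups)


def predict(x):
    """Serve a prediction using the learned coefficients."""
    return slope * x + intercept


print("slope:", slope, "intercept:", intercept)
print("prediction for spend of 6.0:", predict(6.0))
```

The two numbers returned by the fitting step are the whole "model" here; serving a prediction is just plugging a new input into the learned equation.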
“Features” and “feature engineering” are among the most frequently used terms when you deal with a data science team. Here is a summary of what they mean:
When a problem is identified as a machine learning problem, the first thing product managers / business analysts should do is define the business metrics or key performance indicators (KPIs) that will be evaluated at regular intervals to assess the performance of the machine learning models. In my experience, the metrics often remain vague until the models are deployed. This is one of the primary reasons the business is unable to determine the ROI of machine learning deployments, and as a result the projects are likely to get shelved.
Business metrics may or may not be based on technical metrics. For instance, if you are designing a worklist prioritization system in which a machine learning recommendation model scores the work items, a business metric can be the number of analysis hours saved due to worklist prioritization. Alternatively, business metrics can be based on technical metrics such as the accuracy, precision or recall with which worklist items are classified by, say, business criticality.
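As a rough illustration of the technical metrics mentioned above, the sketch below computes accuracy, precision and recall for a worklist classifier. The labels ("critical" / "routine"), the sample data and the function name are assumptions made for this example only.

```python
def classification_metrics(actual, predicted, positive="critical"):
    """Compute accuracy, precision and recall for one positive class."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == positive and p == positive)
    false_pos = sum(1 for a, p in pairs if a != positive and p == positive)
    false_neg = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)

    accuracy = correct / len(pairs)
    # Precision: of the items flagged critical, how many really were.
    flagged = true_pos + false_pos
    precision = true_pos / flagged if flagged else 0.0
    # Recall: of the truly critical items, how many were flagged.
    truly_critical = true_pos + false_neg
    recall = true_pos / truly_critical if truly_critical else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical worklist items: what they really were vs. what the model said.
actual = ["critical", "critical", "routine", "routine",
          "critical", "routine", "routine", "routine"]
predicted = ["critical", "routine", "routine", "critical",
             "critical", "routine", "routine", "routine"]

metrics = classification_metrics(actual, predicted)
print(metrics)
```

Which of the three matters most is a business decision: if missing a critical item is costly, recall is the number to watch; if false alarms waste analyst hours, precision matters more.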
Once deployed in production, a model starts serving predictions. It is then necessary to check how many predictions matched the actual values. This is termed model monitoring: evaluating model performance against pre-defined metrics at regular intervals.
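A simple way to picture model monitoring is a periodic report that compares predictions with actual outcomes and flags any period that falls below an agreed threshold. The sketch below assumes weekly batches, accuracy as the pre-defined metric and a threshold of 0.7; all of these are illustrative choices, not a prescribed setup.

```python
def monitor(batches, threshold):
    """Return (period, accuracy, needs_attention) for each batch."""
    report = []
    for period, predictions, actuals in batches:
        matches = sum(1 for p, a in zip(predictions, actuals) if p == a)
        accuracy = matches / len(actuals)
        # Flag the period when accuracy drops below the agreed threshold.
        report.append((period, accuracy, accuracy < threshold))
    return report


# Hypothetical weekly batches: (period, model predictions, actual values).
weekly_batches = [
    ("week 1", [1, 0, 1, 1, 0], [1, 0, 1, 1, 0]),
    ("week 2", [1, 0, 1, 1, 0], [1, 0, 0, 1, 0]),
    ("week 3", [1, 1, 1, 1, 0], [0, 0, 1, 1, 1]),
]

report = monitor(weekly_batches, threshold=0.7)
for period, accuracy, needs_attention in report:
    status = "needs attention" if needs_attention else "ok"
    print(period, round(accuracy, 2), status)
```

A flagged period is the kind of signal that would prompt a conversation with the data science team about what changed.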
If the models are found to be performing poorly, they are scheduled for retraining. Retraining models would mean some of the following:
Model tuning is nothing but tuning model hyperparameters. Hyperparameters are settings chosen before training that control how a model is learned; unlike coefficients, they are not learned from the data. For example, algorithms such as logistic regression have regularization related hyperparameters. For non-parametric models such as those trained using random forest, the hyperparameters include the number of trees, the maximum depth of a tree etc.
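The sketch below shows one common way tuning is done: train the same model with several candidate values of a hyperparameter and keep the value that performs best on held-out validation data. It uses a deliberately tiny one-coefficient ridge regression, where the regularization strength is the hyperparameter; the data, the candidate values and the function names are assumptions for illustration.

```python
def fit_ridge(xs, ys, strength):
    """Learn one coefficient w for y = w * x with ridge regularization.

    `strength` is the hyperparameter: it is set before training,
    while w is the parameter learned from the data.
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + strength)


def mean_squared_error(w, xs, ys):
    """Average squared gap between predictions w * x and actual values."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training and validation data.
train_x, train_y = [1.0, 2.0, 3.0], [2.2, 4.1, 6.3]
valid_x, valid_y = [4.0, 5.0], [8.0, 10.0]

# Candidate hyperparameter values to try (a very small grid search).
candidates = [0.0, 1.0, 10.0, 100.0]

results = {}
for strength in candidates:
    w = fit_ridge(train_x, train_y, strength)
    results[strength] = mean_squared_error(w, valid_x, valid_y)

# Keep the value with the lowest error on data the model has not seen.
best_strength = min(results, key=results.get)
print("validation error per candidate:", results)
print("best regularization strength:", best_strength)
```

The same pattern applies to the number of trees or the maximum depth of a random forest: the candidate values change, but the loop of train, validate and compare stays the same.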
The terms covered in this post are some of the most important ones; understanding them can help product managers / business analysts work with data science teams without much trouble.