In this post, you will learn about three key challenges, which also serve as guiding principles, to keep in mind while building machine learning models.
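The three key challenges to address while training machine learning models are the following: choosing between model simplicity and model accuracy, deciding on the dimensionality (the number of features), and selecting among several models with comparable accuracy.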
Before training one or more machine learning models, one needs to decide whether to aim for a simple model or to focus on model accuracy. Simplicity can be achieved by using algorithms that produce interpretable models. Such models are usually called data models or statistical models. Examples include multiple regression models, logistic regression models and discriminant analysis models. In fields such as healthcare, data models are the most sought after because they make the model easier to explain. Model simplicity is also achieved through lower dimensionality, that is, a smaller number of features.
When the focus is on higher predictive accuracy, one would rather go for algorithmic models built with algorithms such as decision trees, random forests or neural networks. With algorithmic models, one loses simplicity, because it is difficult to explain what went into making a prediction.
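The trade-off described above can be sketched in a few lines. The example below is a minimal illustration, not a benchmark: it assumes a synthetic, noise-free data set (y = x²) and uses only the Python standard library. A straight-line fit stands in for a data model, and a small hand-rolled regression tree stands in for an algorithmic model. All function names are hypothetical and chosen for illustration.

```python
from statistics import mean

# Synthetic, noise-free data with a non-linear relationship (an assumption).
xs = [i / 10 for i in range(-30, 31)]
ys = [x * x for x in xs]
train_x, train_y = xs[::2], ys[::2]      # every other point for training
test_x, test_y = xs[1::2], ys[1::2]      # the rest is held out

# Data model: a straight line whose two parameters can be read directly.
x_bar, y_bar = mean(train_x), mean(train_y)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(train_x, train_y))
         / sum((x - x_bar) ** 2 for x in train_x))
intercept = y_bar - slope * x_bar


def linear_predict(x):
    return slope * x + intercept


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(points, depth):
    """Greedy regression tree on (x, y) pairs that are sorted by x."""
    if depth == 0 or len(points) < 4:
        return mean(y for _, y in points)          # leaf: predict the mean
    best_cost, best_i = None, None
    for i in range(1, len(points)):                # try every split position
        cost = (sse([y for _, y in points[:i]])
                + sse([y for _, y in points[i:]]))
        if best_cost is None or cost < best_cost:
            best_cost, best_i = cost, i
    threshold = (points[best_i - 1][0] + points[best_i][0]) / 2
    return (threshold,
            build_tree(points[:best_i], depth - 1),
            build_tree(points[best_i:], depth - 1))


def tree_predict(node, x):
    # Internal nodes are tuples; leaves are plain numbers.
    while isinstance(node, tuple):
        threshold, left, right = node
        node = left if x < threshold else right
    return node


def mse(predict, inputs, targets):
    return mean((predict(x) - y) ** 2 for x, y in zip(inputs, targets))


tree = build_tree(sorted(zip(train_x, train_y)), depth=4)
linear_mse = mse(linear_predict, test_x, test_y)
tree_mse = mse(lambda x: tree_predict(tree, x), test_x, test_y)

print(f"linear model: y = {slope:.3f} * x + {intercept:.3f}, held-out MSE = {linear_mse:.3f}")
print(f"regression tree (depth 4): held-out MSE = {tree_mse:.3f}")
```

On this deliberately non-linear data the tree reaches a lower held-out error, but its prediction is spread across many split thresholds, whereas the linear model is fully described by `slope` and `intercept`.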
It is of utmost importance to decide how many features are required to train the most optimal model. A large number of features, each containing some information that can be used for prediction, can play an important role in attaining high model accuracy. However, a large number of features also increases model complexity and may result in over-fitting. On the other hand, a very small number of features may result in under-fitting. One thus needs to decide between a large number of features and a smaller number of important features.
A model with fewer parameters is less complex and is generally preferred because it is likely to generalize better on average. Thus, it is key to use the most appropriate features to build the models.
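One way extra features can hurt generalization is sketched below. This is a minimal illustration under stated assumptions: the data is synthetic, with one informative feature and a configurable number of pure-noise features, and a 1-nearest-neighbour classifier is used only because it is easy to write with the standard library. The helper names (`make_dataset`, `holdout_accuracy`) are hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def make_point(label, n_noise):
    # One informative feature centred on the class label (0 or 1),
    # followed by n_noise features that carry no information at all.
    informative = rng.gauss(label, 0.1)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
    return [informative] + noise, label


def make_dataset(n, n_noise):
    return [make_point(i % 2, n_noise) for i in range(n)]


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def predict_1nn(train, features):
    # Return the label of the closest training point.
    nearest = min(train, key=lambda row: sq_dist(row[0], features))
    return nearest[1]


def holdout_accuracy(n_noise):
    train = make_dataset(100, n_noise)
    test = make_dataset(100, n_noise)
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)


acc_lean = holdout_accuracy(n_noise=0)    # informative feature only
acc_wide = holdout_accuracy(n_noise=50)   # same signal buried in 50 noise features

print(f"1 feature:   held-out accuracy = {acc_lean:.2f}")
print(f"51 features: held-out accuracy = {acc_wide:.2f}")
```

The signal is identical in both runs. Only the number of uninformative features changes, which is why keeping the most appropriate features matters.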
There are different techniques for arriving at the optimal dimensionality. Among them are feature selection techniques, several of which can be used to select the most important features.
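As one possible illustration of feature selection, the sketch below ranks features by the absolute value of their Pearson correlation with the target and keeps the top `k`. This is a simple filter-style approach chosen for illustration, not necessarily one the post has in mind. The feature names and values are made up, and `select_top_k` is a hypothetical helper.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def select_top_k(features, target, k):
    """Keep the k features most strongly correlated with the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Made-up data: three candidate features and one target column.
features = {
    "feature_a": [2, 4, 6, 8, 10, 12],   # perfectly linear in the target
    "feature_b": [1, 3, 2, 5, 4, 6],     # noisy but clearly related
    "feature_c": [3, 1, 3, 1, 3, 1],     # mostly unrelated
}
target = [1, 2, 3, 4, 5, 6]

selected, scores = select_top_k(features, target, k=2)
for name, score in scores.items():
    print(f"{name}: |r| = {score:.3f}")
print("selected:", selected)
```

A correlation filter only captures linear, one-feature-at-a-time relationships, so it is best read as a starting point and not as a complete selection procedure.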
The diagram below taken from this page displays different feature selection techniques which could be used:
While training on a given input data set, one may end up with several models or functions that have comparable accuracy or error. The challenge then becomes which model to select. There are several techniques one could adopt in this regard.
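One heuristic for choosing among comparable models, in line with the earlier preference for simpler models, is the one-standard-error rule: treat every model whose mean validation error is within one standard error of the best model as comparable, then pick the simplest of them. The sketch below assumes this heuristic. It is not necessarily the technique the post intends, and the fold errors and complexity scores are hypothetical numbers made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical cross-validation errors (one value per fold) and a rough
# complexity score per model, for example a parameter count. All made up.
candidates = {
    "logistic_regression": {"complexity": 5,
                            "fold_errors": [0.12, 0.13, 0.125, 0.135, 0.115]},
    "random_forest": {"complexity": 500,
                      "fold_errors": [0.11, 0.13, 0.12, 0.14, 0.10]},
    "neural_network": {"complexity": 5000,
                       "fold_errors": [0.20, 0.22, 0.21, 0.19, 0.23]},
}


def standard_error(values):
    return stdev(values) / sqrt(len(values))


def pick_model(candidates):
    """Simplest model within one standard error of the lowest mean error."""
    mean_errors = {name: mean(c["fold_errors"]) for name, c in candidates.items()}
    best = min(mean_errors, key=mean_errors.get)
    threshold = mean_errors[best] + standard_error(candidates[best]["fold_errors"])
    comparable = [name for name, err in mean_errors.items() if err <= threshold]
    return min(comparable, key=lambda name: candidates[name]["complexity"])


chosen = pick_model(candidates)
print("chosen model:", chosen)
```

With these made-up numbers the random forest has the lowest mean error, but the logistic regression falls within one standard error of it and is far simpler, so it is the one selected.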