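In this post, you will learn about the three most important challenges, or guiding principles, to keep in mind while building machine learning models.
The three key challenges to address while training machine learning models are the following: choosing between model simplicity and model accuracy, deciding how many features (dimensions) to use, and selecting among several models that have comparable accuracy.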
Before starting to train one or more machine learning models, one needs to decide whether to go for a simple model or to focus on model accuracy. Simplicity can be achieved by using algorithms that help in building interpretable models. These models are primarily called data or statistical models. Examples of such models are multiple regression models, logistic models, and discriminant analysis models. In the field of healthcare, data models are the most sought after because they are easier to explain, which is the main benefit of a simpler model. Model simplicity is also achieved by having lower dimensionality, that is, fewer features.
When the focus is on achieving higher predictive accuracy, one would rather go for algorithmic models created using algorithms such as decision trees, random forests, and neural networks. With algorithmic models, one loses simplicity, because it is difficult to explain what went into making a prediction.
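To make the interpretability point concrete, here is a minimal sketch of a multiple regression fitted with ordinary least squares in NumPy. The data set is synthetic, and the feature names and true coefficients are assumptions made up for illustration; the point is only that each fitted coefficient can be read directly as the effect of one feature.

```python
import numpy as np

# Synthetic data (hypothetical): two features with known effects on the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_intercept, true_coefs = 1.0, np.array([2.0, -0.5])
y = true_intercept + X @ true_coefs + rng.normal(scale=0.1, size=200)

# Fit a multiple regression by ordinary least squares.
# A column of ones is prepended so the first coefficient is the intercept.
X_design = np.column_stack([np.ones(len(X)), X])
coefs, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# Each coefficient reads as "change in y per unit change in that feature".
for name, value in zip(["intercept", "feature_1", "feature_2"], coefs):
    print(f"{name}: {value:.2f}")
```

An algorithmic model such as a random forest or a neural network offers no comparably short summary of how each feature drives a prediction, which is the trade-off described above.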
It is of utmost importance to decide how many features are required to train the most optimal model. A large number of features, each containing some information that could be used for prediction, can play an important role in attaining high model accuracy. However, a large number of features also increases model complexity. In addition, a larger number of features may result in model over-fitting. On the other hand, having very few features may result in model under-fitting. One thus needs to decide whether to use a large number of features or a smaller number of important features.
Models with fewer parameters are less complex and, because of this, are often preferred, as they are likely to generalize better on average. Thus, it is key to use the most appropriate features to build the models.
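The under-fitting and over-fitting behaviour described above can be sketched with polynomial regression, where the polynomial degree plays the role of the number of features (powers of x). The data below is synthetic, and the sine target, noise level, and chosen degrees are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def make_data(n):
    """Draw n noisy samples of a hypothetical target, sin(pi * x)."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=n)
    return x, y

x_train, y_train = make_data(30)    # small training set
x_test, y_test = make_data(200)     # held-out data to check generalization

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 3, 9):
    # Least-squares fit using the features x, x^2, ..., x^degree.
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    train_err, test_err = results[degree]
    print(f"degree={degree}: train MSE={train_err:.3f}, test MSE={test_err:.3f}")
```

Training error can only go down as the degree grows, because each larger model contains the smaller ones. The held-out error is the number to watch: the degree-1 model is too simple for this target, while a high degree may start fitting noise, depending on the data drawn.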
There are different techniques for arriving at the optimal dimensionality. One of them is feature selection: several feature selection techniques can be used to select the most important features.
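As one illustration of feature selection, the sketch below ranks features by the absolute value of their Pearson correlation with the target and keeps the top k. This is a simple filter-style approach chosen here as an example, not necessarily one of the techniques the post has in mind. The data is synthetic, and the helper name `select_top_k_by_correlation` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
n_samples, n_features = 300, 6
X = rng.normal(size=(n_samples, n_features))

# Hypothetical ground truth: only features 0 and 3 influence the target.
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=n_samples)

def select_top_k_by_correlation(X, y, k):
    """Return the indices of the k features most correlated with y, plus all scores."""
    scores = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    )
    top_k = np.argsort(scores)[::-1][:k]  # indices of the k largest scores
    return sorted(int(j) for j in top_k), scores

selected, scores = select_top_k_by_correlation(X, y, k=2)
print("selected feature indices:", selected)
print("correlation scores:", np.round(scores, 2))
```

A correlation filter only captures linear, one-feature-at-a-time relationships, so it can miss features that matter only in combination with others.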
The diagram below, taken from this page, displays the different feature selection techniques that could be used:
While training a model on a given input data set, one may end up building several models / functions that have comparable accuracy or error. The challenge then becomes which model to select. There are several techniques one could adopt in this regard.
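One possible selection rule, offered here as an assumption rather than as a technique taken from this post, follows from the earlier point that less complex models tend to generalize better: among the models whose validation error is within a small tolerance of the best one, pick the one with the fewest parameters. The candidate names, error values, parameter counts, and tolerance below are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    validation_error: float  # lower is better
    n_parameters: int        # rough proxy for model complexity

def pick_simplest_comparable(candidates, tolerance=0.01):
    """Among models within `tolerance` of the best error, return the simplest."""
    best_error = min(c.validation_error for c in candidates)
    comparable = [
        c for c in candidates if c.validation_error - best_error <= tolerance
    ]
    # Prefer fewer parameters; break ties by lower validation error.
    return min(comparable, key=lambda c: (c.n_parameters, c.validation_error))

# Illustrative numbers only, not measured results.
candidates = [
    Candidate("model_a", validation_error=0.120, n_parameters=5000),
    Candidate("model_b", validation_error=0.125, n_parameters=40),
    Candidate("model_c", validation_error=0.180, n_parameters=8),
]

chosen = pick_simplest_comparable(candidates)
print(chosen.name)
```

Here `model_a` and `model_b` count as comparable, so the much smaller `model_b` is chosen, while `model_c` is simpler still but too inaccurate to be considered. The tolerance is a judgment call and would need to reflect how noisy the validation estimate is.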