8 Most Common Machine Learning Tasks

This article presents some of the most common machine learning tasks that one may come across while trying to solve a machine learning problem. Under each task, it also lists a set of machine learning methods that could be used for that task. Please feel free to comment or suggest if I have missed one or more important points.

The following are the key machine learning tasks covered in this article:

  • Feature selection
  • Regression
  • Classification
  • Clustering
  • Multivariate querying
  • Density estimation
  • Dimension reduction
  • Testing and matching



The following are the eight most common machine learning tasks that one is likely to come across while solving an advanced analytics problem:

  1. Feature Selection: Feature selection is one of the critical tasks in building machine learning models. It is important because selecting the right features not only helps build models of higher accuracy but also helps achieve objectives such as building simpler models and reducing overfitting. The following are some of the techniques which could be used for feature selection:
    • Filter methods, which help in selecting features based on the outcomes of statistical tests. The following are some of the statistical tests and measures which are used:
      • Pearson’s correlation
      • Linear discriminant analysis (LDA)
      • Analysis of Variance (ANOVA)
      • Chi-square tests
    • Wrapper methods, which help in feature selection by training a model on a subset of features and measuring the resulting model accuracy. The following are some of the algorithms used:
      • Forward selection
      • Backward elimination
      • Recursive feature elimination
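    • Regularization techniques, which penalize the coefficients of one or more features so that the most important features stand out. The following are some of the algorithms used:
      • LASSO (L1) regularization (can shrink the coefficients of unimportant features exactly to zero)
      • Ridge (L2) regularization (shrinks coefficients but, unlike LASSO, does not set them exactly to zero)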
  2. Regression: Regression tasks mainly deal with the estimation of numerical values (continuous variables). Examples include estimating housing prices, product prices, stock prices etc. Some of the following ML methods could be used for solving regression problems:
    • Kernel regression (Higher accuracy)
    • Gaussian process regression (Higher accuracy)
    • Regression trees
    • Linear regression
    • Support vector regression
    • LASSO
  3. Classification: Classification tasks deal with predicting the category of a data point (discrete variables). One of the most common examples is predicting whether an email is spam or ham. Common use cases are found in healthcare, such as predicting whether a person is suffering from a particular disease or not. Classification also has applications in finance, such as determining whether a transaction is fraudulent or not. ML methods such as the following could be applied to solve classification tasks:
    • Kernel discriminant analysis (Higher accuracy)
    • K-Nearest Neighbors (Higher accuracy)
    • Artificial neural networks (ANN) (Higher accuracy)
    • Support vector machine (SVM) (Higher accuracy)
    • Random forests (Higher accuracy)
    • Decision trees
    • Boosted trees
    • Logistic regression
    • Naive Bayes
    • Deep learning
  4. Clustering: Clustering tasks are all about finding natural groupings of data and a label associated with each of these groupings (clusters). Common examples include customer segmentation and identification of product features for a product roadmap. The following are some of the common ML methods:
    • Mean-shift (Higher accuracy)
    • Hierarchical clustering
    • K-means
    • Topic models
  5. Multivariate querying: Multivariate querying is about querying or finding similar objects. Some of the following ML methods could be used for such problems:
    • Nearest neighbors
    • Range search
    • Farthest neighbors
  6. Density estimation: Density estimation problems deal with finding the likelihood or frequency of objects. In probability and statistics, density estimation is the construction of an estimate, based on observed data, of an unobservable underlying probability density function. Some of the following ML methods could be used for solving density estimation tasks:
    • Kernel density estimation (Higher accuracy)
    • Mixture of Gaussians
    • Density estimation tree
  7. Dimension reduction: As per the Wikipedia page on dimension reduction, dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction. The following are some of the ML methods that could be used for dimension reduction:
    • Manifold learning/KPCA (Higher accuracy)
    • Principal component analysis
    • Independent component analysis
    • Gaussian graphical models
    • Non-negative matrix factorization
    • Compressed sensing
  8. Testing and matching: Testing and matching tasks relate to comparing data sets. The following are some of the methods that could be used for such problems:
    • Minimum spanning tree
    • Bipartite cross-matching
    • N-point correlation
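
To make the filter methods listed under feature selection more concrete, here is a minimal sketch of a Pearson's correlation filter in plain Python. The function names (pearson, select_features), the 0.5 threshold and the toy housing data are all assumptions made for illustration; a real project would more likely rely on a library implementation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def select_features(columns, target, threshold=0.5):
    """Keep the names of features whose |correlation| with the target meets the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]

# Hypothetical toy data: house size tracks price closely, the house number does not.
features = {
    "size_sqm": [50, 60, 80, 100, 120],
    "house_number": [7, 3, 9, 1, 5],
}
price = [150, 180, 240, 310, 360]

selected = select_features(features, price)
print(selected)
```

With this toy data only the size feature passes the filter, since the house number is only weakly correlated with the price. The threshold is a tuning choice, not a fixed rule.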

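The K-Nearest Neighbors method listed under classification, which builds on the nearest neighbors search listed under multivariate querying, could be sketched as follows. The function name knn_predict, the choice of k=3 and the toy email features (number of links, number of exclamation marks) are assumptions made purely for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the training points by Euclidean distance to the query and keep k.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (number of links, number of exclamation marks) per email.
emails = [
    ((0, 0), "ham"), ((1, 0), "ham"), ((0, 1), "ham"),
    ((8, 6), "spam"), ((9, 7), "spam"), ((7, 9), "spam"),
]

print(knn_predict(emails, (8, 8)))  # close to the spam examples
print(knn_predict(emails, (1, 1)))  # close to the ham examples
```

Sorting every training point for each query is fine for a sketch like this one. Larger data sets usually call for a spatial index so that neighbor queries stay fast.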

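For the clustering task, the K-means method can be sketched in a few lines of plain Python. This is a simplified version of the usual assign-then-update loop, with hand-picked starting centroids and a fixed number of iterations. The function name kmeans and the toy points are assumptions for illustration only.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                # Mean of each coordinate over the points in the cluster.
                new_centroids.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        centroids = new_centroids
    return centroids, clusters

# Hypothetical toy data: two visibly separate groups of customers.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

In practice the starting centroids are usually chosen at random or with a seeding heuristic, and the result can depend on that choice.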

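The idea behind density estimation, building an estimate of an unobservable probability density function from observed data, can be illustrated with a minimal one-dimensional kernel density estimate. The Gaussian kernel, the bandwidth of 0.5, the function name gaussian_kde and the sample values are all assumptions chosen for this sketch.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x from the samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observed sample, centered on that sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

# Hypothetical observations: most values cluster around 1, with one outlier at 5.
samples = [1.0, 1.2, 0.9, 1.1, 5.0]
density = gaussian_kde(samples, bandwidth=0.5)

print(density(1.0))  # high: many samples nearby
print(density(3.0))  # low: no samples nearby
```

The bandwidth controls how smooth the estimate is: a smaller value follows the individual samples more closely, while a larger value blurs them together.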
Ajitesh Kumar

Ajitesh has recently been working in the area of AI and machine learning. Currently, his research area includes Safe & Quality AI. In addition, he is passionate about various technologies, including programming languages such as Java/JEE and JavaScript, and technologies such as Blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, big data etc.

He has also authored the book, Building Web Apps with Spring 5 and Angular.