Data Science – Quick Start Guide for Machine Learning

This article presents very high-level information on different aspects of machine learning, with the objective of providing a quick-start guide for data science beginners. To learn the subject in detail, pick up one or more books on machine learning. Please feel free to comment if I have missed any important points.
The following key points are covered in this article:

  • What is machine learning?
  • Key phases of machine learning
  • Prediction API model of machine learning


What is Machine Learning?

Simply speaking, machine learning is a set of artificial intelligence techniques used to solve one of the following kinds of problems, based on the examples at hand:

  • Classification problems: Problems that ask “which type”. For example, “which type” of email is this (spam or ham)? In a classification problem, one answer is chosen from a fixed number of possible answers.
  • Regression problems: Problems that ask “how much”. For example, “how much” should a house in a given locality cost? Simply speaking, regression problems have a numeric answer.
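
The difference between the two problem types can be sketched in a few lines of Python. This is only a toy illustration, assuming a simple nearest-neighbour approach; the function names and all data values are made up for the example and are not from any particular library.

```python
def nearest(examples, x, k):
    """Return the k examples whose feature value is closest to x."""
    return sorted(examples, key=lambda ex: abs(ex[0] - x))[:k]

def classify(examples, x, k=3):
    """Classification: answer 'which type' by majority vote among neighbours."""
    labels = [label for _, label in nearest(examples, x, k)]
    return max(set(labels), key=labels.count)

def regress(examples, x, k=3):
    """Regression: answer 'how much' by averaging the neighbours' values."""
    values = [value for _, value in nearest(examples, x, k)]
    return sum(values) / len(values)

# Hypothetical feature: number of suspicious words in an email -> label
emails = [(0, "ham"), (1, "ham"), (2, "ham"), (7, "spam"), (8, "spam"), (9, "spam")]
# Hypothetical feature: house area -> price (arbitrary units)
houses = [(50, 100.0), (60, 120.0), (70, 140.0), (100, 200.0), (110, 220.0), (120, 240.0)]

email_type = classify(emails, 8)    # one label out of a fixed set
house_price = regress(houses, 60)   # a number
print(email_type, house_price)
```

The classifier can only ever return one of the labels it has seen, while the regressor returns a number that need not appear in the examples at all.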

Machine learning is a key aspect of data science. It allows a data scientist to feed existing data sets to a machine learning algorithm and make predictions based on the result. In other words, anyone who wants to become a data scientist must learn machine learning algorithms in order to predict or recommend.


Key Phases of Machine Learning

The following are the key steps in machine learning:

  • Training: Train the model on an existing data set
  • Prediction: Predict using the trained model, given a new input data set
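
The two phases can be sketched as two separate functions. The example below is a minimal sketch, assuming a simple least-squares line as the model; the names `train` and `predict` and the data values are illustrative assumptions, not part of any library.

```python
def train(xs, ys):
    """Training phase: fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {"intercept": intercept, "slope": slope}

def predict(model, new_xs):
    """Prediction phase: apply the trained model to inputs it has not seen."""
    return [model["intercept"] + model["slope"] * x for x in new_xs]

# Hypothetical training data lying exactly on y = 1 + 2x
model = train([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
predictions = predict(model, [5.0, 10.0])
print(predictions)
```

Training happens once and produces a model; that model can then be reused for any number of prediction calls.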


Prediction API Model of Machine Learning

From an API perspective, the steps above can be represented as follows. Whether you use R or Python APIs, the API structure looks like this (createModelAPI and createPredictionAPI are placeholder names, not real functions):

# Training: create a model from an existing data set
model = createModelAPI(existingDataSet)
# Prediction: feed a new data set, newDataSet, to the model to get predictedOutput
predictedOutput = createPredictionAPI( model, newDataSet )
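
To illustrate that this two-call shape stays the same regardless of the algorithm, here is a small Python sketch. It assumes two deliberately simple model types (an average-value baseline and a nearest-example lookup); every function name and data value is hypothetical and chosen only for this example.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, float]       # (input, known output)
Model = Callable[[float], float]    # a trained model maps an input to an output

def create_mean_model(existing: List[Example]) -> Model:
    """Baseline: always predict the average of the known outputs."""
    mean = sum(y for _, y in existing) / len(existing)
    return lambda x: mean

def create_nearest_model(existing: List[Example]) -> Model:
    """Predict the output of the closest known input."""
    return lambda x: min(existing, key=lambda ex: abs(ex[0] - x))[1]

def create_prediction(model: Model, new_data: List[float]) -> List[float]:
    """Same prediction call, whichever algorithm produced the model."""
    return [model(x) for x in new_data]

existing_data_set = [(1.0, 10.0), (2.0, 20.0), (3.0, 60.0)]
new_data_set = [1.2, 2.9]

results = {}
for name, create_model in [("mean", create_mean_model), ("nearest", create_nearest_model)]:
    model = create_model(existing_data_set)                    # training step
    results[name] = create_prediction(model, new_data_set)     # prediction step
print(results)
```

Only the model-creation function changes between algorithms; the calling code around it is identical.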

Let's take an example of linear regression using the R console. Look at the code below:

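# Assumes the diamonds data set that ships with the ggplot2 package
library(ggplot2)
# Illustrative stand-in for new data: any data frame with the predictor columns
newData <- head(diamonds)
# Linear regression model
model <- lm(price ~ carat, data = diamonds)
price <- predict(model, newData)
# Multiple linear regression model
model <- lm(price ~ carat + cut + color, data = diamonds)
price <- predict(model, newData)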
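
For readers using Python rather than R, the following is a rough sketch of what a multiple linear regression fit and prediction involve, written in plain Python with no libraries. It assumes numeric predictors only (R's formula interface also handles categorical columns such as cut and color, which this sketch does not), solves the least-squares normal equations directly, and uses made-up data; `solve`, `fit`, and `predict` are illustrative names, not a library API.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit(rows, ys):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]   # leading 1.0 is the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve(xtx, xty)

def predict(coefs, rows):
    """Apply the fitted coefficients to new rows of predictor values."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], row)) for row in rows]

# Hypothetical data generated exactly from: price = 1 + 2 * x1 + 3 * x2
rows = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (3.0, 2.0), (2.0, 3.0)]
prices = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefs = fit(rows, prices)
new_prices = predict(coefs, [(4.0, 4.0)])
print(coefs, new_prices)
```

Because the made-up prices follow the formula exactly, the fitted coefficients should come out close to 1, 2, and 3, up to floating-point rounding. In practice a numerical library would be used instead of hand-written elimination.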


Ajitesh Kumar


Ajitesh has recently been working in the area of AI and machine learning. Currently, his research areas include Safe & Quality AI. He is also passionate about various technologies, including programming languages such as Java/JEE and JavaScript, and technologies such as blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, and big data.

He has also authored the book, Building Web Apps with Spring 5 and Angular.