Data Science – Quick Start Guide for Machine Learning

This article presents a very high-level overview of different aspects of machine learning, intended as a quick-start guide for data science beginners. To learn the subject in detail, pick up one or more books on machine learning. Please feel free to comment or suggest if I have missed any important points.

The following key points are covered in this article:

What is machine learning?
Key phases of machine learning
Prediction API model of machine learning

What is Machine Learning?

Simply put, machine learning is a set of artificial intelligence techniques used to solve one of the following kinds of problems, based on the examples at hand:

Classification problems: Problems that ask “which type”. For example, “which type” of email is this (spam or ham)? In a classification problem, one answer is chosen from a fixed number of possible answers.
Regression problems: Problems that ask “how much”. For example, “how much” should a house in a given locality cost? Simply put, regression problems have a numeric answer.
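To make the difference between the two problem types concrete, here is a minimal Python sketch. The data, the function names (classify_email, estimate_price) and the very naive methods used are all made up for illustration; they are not a recommended algorithm. The point is only the shape of the answer: a label from a fixed set versus a number.

```python
from statistics import mean

# Made-up examples: (number of suspicious words in an email, label)
emails = [(0, "ham"), (1, "ham"), (5, "spam"), (7, "spam")]

# Made-up examples: (house area in square metres, price)
houses = [(50, 100_000), (80, 160_000), (100, 200_000)]


def classify_email(suspicious_words):
    """Classification: return the label of the closest known example."""
    _, label = min(emails, key=lambda e: abs(e[0] - suspicious_words))
    return label


def estimate_price(area):
    """Regression: return a number (area times the average price per square metre)."""
    price_per_sqm = mean(price / size for size, price in houses)
    return area * price_per_sqm


email_type = classify_email(6)    # "which type": one of a fixed set of answers
house_price = estimate_price(70)  # "how much": a numeric answer
print(email_type, house_price)
```

Here classify_email can only ever return "spam" or "ham", while estimate_price can return any number.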

Machine learning is a key aspect of data science. It allows a data scientist to apply a machine learning algorithm to existing data sets and make predictions based on them. In other words, anyone wanting to become a data scientist must learn machine learning algorithms in order to predict or recommend.

Key Phases of Machine Learning

The following are the key steps in machine learning:

Training: Train the model on an existing data set
Prediction: Predict using the model, given an input data set

Prediction API Model of Machine Learning

From an API perspective, the above steps can be represented as follows. Whether you use R or Python APIs, the API structure would look like this:

# The model is created from a given data set, existingDataSet
model = createModelAPI( existingDataSet )
# The model is fed a new data set, newDataSet, and returns the predicted output, predictedOutput
predictedOutput = createPredictionAPI( model, newDataSet )
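The same two-step structure can be sketched in plain Python. The functions create_model and create_prediction below are hypothetical stand-ins for the placeholder API names above, not calls from any real library, and the data is made up. The model is a deliberately simple one: training stores the mean feature value per label, and prediction picks the label whose mean is nearest.

```python
from statistics import mean


def create_model(existing_data_set):
    """Training step: summarise each label by the mean of its feature values."""
    values_by_label = {}
    for value, label in existing_data_set:
        values_by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in values_by_label.items()}


def create_prediction(model, new_data_set):
    """Prediction step: give each new value the label with the nearest mean."""
    return [
        min(model, key=lambda label: abs(model[label] - value))
        for value in new_data_set
    ]


# Made-up training examples: (feature value, label)
existing_data_set = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]
# Made-up new inputs, without labels
new_data_set = [1.5, 7.0]

model = create_model(existing_data_set)
predicted_output = create_prediction(model, new_data_set)
print(predicted_output)
```

Note that the model is built once from the existing data and can then be reused for any number of new data sets.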

Let's take an example of linear regression using the R programming console. Look at the code below:

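# The diamonds data set ships with the ggplot2 package
library(ggplot2)
# Stand-in for new data (assumption: any data frame with the same columns would work)
newData = diamonds[1:5, ]
# Linear regression model: price explained by carat
model = lm( price ~ carat, data=diamonds )
predictedPrice = predict( model, newData )
# Multiple linear regression model: price explained by carat, cut and color
model = lm( price ~ carat + cut + color, data=diamonds )
predictedPrice = predict( model, newData )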
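For readers on the Python side, the single-predictor case can be sketched with the standard library alone. This is a hand-written least-squares fit rather than a call into a machine learning library; fit_linear_model and predict are illustrative names, and the carat and price numbers are made up rather than taken from the diamonds data set.

```python
from statistics import mean


def fit_linear_model(xs, ys):
    """Training: least-squares fit of y = intercept + slope * x (one predictor)."""
    x_mean, y_mean = mean(xs), mean(ys)
    slope = (
        sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
        / sum((x - x_mean) ** 2 for x in xs)
    )
    intercept = y_mean - slope * x_mean
    return intercept, slope


def predict(model, new_xs):
    """Prediction: apply the fitted line to new inputs."""
    intercept, slope = model
    return [intercept + slope * x for x in new_xs]


# Made-up numbers for illustration only, not taken from the diamonds data set
carats = [0.5, 1.0, 1.5, 2.0]
prices = [1500, 3500, 5500, 7500]

model = fit_linear_model(carats, prices)
predicted_prices = predict(model, [1.2])
print(model, predicted_prices)
```

The structure is the same as in the R code: one call builds the model from existing data, and a second call uses the model on new data.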

Author

Ajitesh Kumar