Free Datasets for Machine Learning & Deep Learning

Here is a list of free, publicly available datasets for machine learning and deep learning.

Datasets for machine learning problems:

UC Irvine Machine Learning Repository: A repository of 560 datasets suitable for traditional machine learning problems such as classification and regression.
Publicly available datasets through public APIs: A list of 650+ datasets available via public APIs.
Penn machine learning dataset: The datasets cover a broad range of applications and include binary/multi-class classification problems and regression problems, as well as combinations of categorical, ordinal, and continuous features. The good part is that the datasets are available in tabular form, which makes them very useful for training models with traditional …

Continue reading

Posted in Data Science, Deep Learning, Machine Learning.

Actionable Insights Examples – Turning Data into Action

In this post, you will learn how to turn data into information and then into actionable insights, with the help of a few examples. It will help data analysts, data scientists, and business analysts get a good understanding of what an actionable insight is. You will also understand aspects related to data-driven decision making. Before getting into the details, let’s understand the problem at hand. The school authority is trying to assess and improve the health of students. Here is the question it is dealing with: How could we improve the overall health of the students in the school? We will look into the approach of finding the …

Continue reading

Posted in Analytics, Data Science.

When to use Deep Learning vs Machine Learning Models?

In this post, you will learn when to train deep learning models, from the perspective of model performance and volume of data. A machine learning engineer or data scientist often wonders whether deep learning models can be used in place of traditional machine learning models trained using algorithms such as logistic regression, SVM, tree-based algorithms, etc. The objective of this post is to provide perspectives on when to choose traditional machine learning models versus deep learning models. The two key criteria for deciding between deep learning and traditional machine learning models are the following: …

Continue reading

Posted in Data Science, Deep Learning, Machine Learning.

Most Common Types of Machine Learning Problems

In this post, you will learn about the most common types of machine learning (ML) problems along with a few examples. Without further ado, let’s look at these problem types and understand the details. The problem types are regression, classification, clustering, time-series forecasting, anomaly detection, ranking, recommendation, data generation, and optimization.

Problem type: Regression
Details: When the need is to predict numerical values, such kinds of problems are called regression problems. For example, house price prediction.
Algorithms: Linear regression, K-NN, random forest, neural networks

Problem type: Classification
Details: When there is a need to classify the data in different classes, it is called a classification problem. If there are two classes, it is called a binary classification problem. …

Continue reading

Posted in Data Science, Machine Learning.

Historical Dates & Timeline for Deep Learning

This post is a quick look at the timeline, including historical dates, of the evolution of deep learning. Without further ado, let’s get to the important dates and what happened on those dates in relation to deep learning:

Year: 1943
Details/Paper information: An artificial neuron was proposed as a computational model of the “nerve net” in the brain. Paper: “A logical calculus of the ideas immanent in nervous activity,” Bulletin of Mathematical Biophysics, volume 5, 1943
Who’s who: Warren McCulloch, Walter Pitts

Year: Late 1950s
Details/Paper information: A neural network application for reducing noise in phone lines was developed. Paper: Andrew Goldstein, “Bernard Widrow oral history,” IEEE Global History Network, 1997
Who’s who: Bernard …

Continue reading

Posted in Data Science, Deep Learning, Machine Learning.

Machine Learning Techniques for Stock Price Prediction

In this post, you will learn about some of the popular machine learning techniques for predicting stock price movement (the direction of the stock price) and classifying whether a stock is a buy, sell, or hold. Stock price prediction is a fairly complex problem, and different techniques can be used to achieve good prediction accuracy. Here are the three most common techniques used for building machine learning models that predict stock price movement (upward / downward) and classify whether a stock is a buy, sell, or hold: Fundamental analysis: In fundamental analysis (FA), the machine learning models can be trained using data related to companies’ …

Continue reading

Posted in Data Science, Machine Learning.

Machine Learning – Why use Confidence Intervals?

In this post, you will learn about confidence intervals in relation to machine learning models, with the help of an example and Python code. When you get a hypothesis function by training a machine learning classification model, you evaluate the hypothesis/model by calculating the classification error. The classification error is calculated on the sample of the data used for training the model. However, is this classification error for the sample (sample error) the same as the classification error of the hypothesis/model for the entire population (true error)? How can the true error be represented as a function of the sample error? This is …
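
As a rough illustration of how a sample error might be turned into an interval for the true error, the sketch below uses the normal approximation to the binomial distribution. This is one common textbook approach and not necessarily the one the full post uses. The function name and the example counts (12 errors in 100 examples) are hypothetical, and the approximation assumes the evaluated examples were drawn independently of the model and that the sample is reasonably large.

```python
import math


def error_confidence_interval(n_errors, n_samples, z=1.96):
    """Approximate confidence interval for the true error.

    Uses sample_error +/- z * sqrt(sample_error * (1 - sample_error) / n).
    z = 1.96 corresponds to roughly 95% confidence (an assumed default).
    """
    sample_error = n_errors / n_samples
    half_width = z * math.sqrt(sample_error * (1 - sample_error) / n_samples)
    # Clip to [0, 1] because an error rate cannot fall outside that range.
    lower = max(0.0, sample_error - half_width)
    upper = min(1.0, sample_error + half_width)
    return sample_error, lower, upper


# Hypothetical evaluation: 12 misclassified examples out of 100.
sample_error, lower, upper = error_confidence_interval(n_errors=12, n_samples=100)
print(f"sample error = {sample_error:.3f}, interval = [{lower:.3f}, {upper:.3f}]")
```

The interval narrows as the number of evaluated examples grows, which is why a sample error measured on only a few examples says little about the true error.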

Continue reading

Posted in Data Science, Machine Learning.

Great Mind Maps for Learning Machine Learning

In this post, you will look at some great mind maps for learning different machine learning topics. I have gathered these mind maps from different web pages on the Internet. The idea is to reinforce our understanding of different machine learning topics using pictures. You may have heard the proverb: a picture is worth a thousand words. Keeping this in mind, I decided to pull together some of the great mind maps posted on different web pages. I will update this blog post from time to time. Whether you are a beginner data scientist or an experienced one, you may want to bookmark this page for refreshing your …

Continue reading

Posted in Data Science, Machine Learning.

Different Types of Distance Measures in Machine Learning

In this post, you will learn about the different types of distance measures used in machine learning algorithms such as K-nearest neighbours and K-means. Distance measures are used to measure the similarity between two or more vectors in multi-dimensional space. Distance metrics / measures take the following forms:

Geometric distances
Computational distances
Statistical distances

Geometric Distance Measures

Geometric distance metrics primarily measure the similarity between two or more vectors based solely on the distance between two points in multi-dimensional space. Examples of geometric distance measures are Minkowski distance, Euclidean distance, and Manhattan distance. Another form of geometric distance is cosine similarity, which we will discuss …
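
The geometric measures named above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the post: the function names and the example vectors `u` and `v` are made up, and the sketch assumes equal-length, non-zero vectors. Manhattan and Euclidean distance are written as the special cases p = 1 and p = 2 of Minkowski distance.

```python
import math


def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def manhattan_distance(a, b):
    """Minkowski distance with p = 1 (sum of absolute differences)."""
    return minkowski_distance(a, b, p=1)


def euclidean_distance(a, b):
    """Minkowski distance with p = 2 (straight-line distance)."""
    return minkowski_distance(a, b, p=2)


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical example vectors.
u = [1.0, 2.0, 3.0]
v = [4.0, 6.0, 3.0]

print("Manhattan:", manhattan_distance(u, v))
print("Euclidean:", euclidean_distance(u, v))
print("Cosine similarity:", cosine_similarity(u, v))
```

Note that cosine similarity depends only on the angle between the vectors, so scaling either vector leaves it unchanged, while the Minkowski family is sensitive to magnitude.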

Continue reading

Posted in Data Science, Machine Learning.

Introduction to Algorithms & Related Computational Tasks

In this post, you will be introduced to some important classes of algorithms and the computational tasks that can be handled using them. Here are the classes of algorithms briefly discussed in this post:

Divide and conquer algorithms
Graph-based algorithms
Greedy algorithms
Dynamic programming
Linear programming
NP-complete algorithms
Quantum algorithms

Divide-and-Conquer Algorithms

Divide and conquer algorithms solve a problem using the divide and conquer strategy. The steps of a divide-and-conquer algorithm are the following:

Breaking the problem into subproblems that are themselves smaller instances of the same type of problem
Recursively solving these subproblems
Appropriately combining their …
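
The divide-and-conquer steps above can be illustrated with merge sort. Merge sort is chosen here only as a familiar instance of the strategy; the excerpt does not name it, and the example input is made up.

```python
def merge_sort(items):
    """Sort a sequence using the divide-and-conquer strategy."""
    # Base case: zero or one element is already sorted.
    if len(items) <= 1:
        return list(items)

    # Step 1: break the problem into two smaller instances of the same problem.
    mid = len(items) // 2

    # Step 2: recursively solve each subproblem.
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Step 3: combine the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))
```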

Continue reading

Posted in Algorithms.

Hold-out Method for Training Machine Learning Models

In this post, you will learn about the hold-out method used when training machine learning models. When evaluating machine learning (ML) models, several questions arise. Is the model the best one available from the algorithm's hypothesis space in terms of generalization error on unseen / future data? Was the model trained and tested using the most appropriate method? Out of the available models, which one should be selected? These questions are addressed using what is called the hold-out method. Instead of using the entire dataset for training, different sets called the validation set and the test set are separated or set aside …
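
A minimal sketch of the hold-out idea, assuming a simple random three-way split. The function name `hold_out_split`, the 60/20/20 proportions, and the fixed seed are illustrative assumptions and do not come from the post.

```python
import random


def hold_out_split(data, val_fraction=0.2, test_fraction=0.2, seed=42):
    """Randomly split data into training, validation, and test sets."""
    items = list(data)
    # Shuffle a copy so the split does not depend on the original ordering.
    random.Random(seed).shuffle(items)

    n_test = int(len(items) * test_fraction)
    n_val = int(len(items) * val_fraction)

    test = items[:n_test]                  # set aside for the final evaluation
    val = items[n_test:n_test + n_val]     # set aside for model selection / tuning
    train = items[n_test + n_val:]         # remaining examples used for training
    return train, val, test


# Hypothetical dataset of 100 example identifiers.
train, val, test = hold_out_split(range(100))
print(len(train), len(val), len(test))
```

Each example lands in exactly one of the three sets, so the validation and test sets stay unseen during training.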

Continue reading

Posted in Data Science, Machine Learning.

Machine Learning Terminologies for Beginners

When starting on the journey of learning machine learning and data science, we come across many different terms in articles, posts, books, and video lectures. A good understanding of these terms and the related concepts helps us learn the subject well. At a senior level, it gets tricky at times when the team of data scientists / ML engineers explains its projects and related outcomes. With this in context, this post lists a set of commonly used machine learning terms that will help us get a good understanding of ML concepts and also engage with the DS / AI / ML team in …

Continue reading

Posted in Data Science, Machine Learning.

Bias & Variance Concepts & Interview Questions

In this post, you will learn about the concepts of bias & variance in relation to machine learning (ML) models. In addition to learning the concepts, you will also get a chance to take a quiz that will help you prepare for data scientist / ML engineer interviews. As a data scientist / ML engineer, you must get a good understanding of bias and variance in order to build models that generalize better, that is, have a lower generalization error.

Bias & Variance of Machine Learning Models

The bias of a model, intuitively speaking, can be defined as the affinity of the model to make predictions or estimates based on only …

Continue reading

Posted in Data Science, Interview questions, Machine Learning.

Machine Learning Free Course at Univ Wisconsin Madison

In this post, you will learn about the free course on machine learning (STAT 451) recently taught at the University of Wisconsin-Madison by Dr. Sebastian Raschka. Dr. Sebastian Raschka is currently working as an Assistant Professor of Statistics at the University of Wisconsin-Madison, focusing on deep learning and machine learning research. The course is titled “Introduction to Machine Learning”. The recordings of the course lectures can be found on the page – Introduction to machine learning. The course covers some of the following topics:

What is machine learning?
Nearest neighbour methods
Computational foundation
Python programming (concepts)
Machine learning in scikit-learn
Tree-based methods
Decision trees
Ensemble methods
Model evaluation techniques
Concepts of …

Continue reading

Posted in Data Science, Machine Learning, Online Courses.

Overfitting & Underfitting Concepts & Interview Questions

In this post, you will learn about some of the key concepts of overfitting and underfitting in relation to machine learning models. In addition, you will get a chance to test your understanding by attempting the quiz. The quiz will help you prepare well for interview questions on underfitting & overfitting. As a data scientist, you must get a good understanding of the overfitting and underfitting concepts.

Introduction to Overfitting & Underfitting

Assuming an independent and identically distributed (i.i.d.) dataset, when the prediction error on both the training and test datasets is high, the model is said to underfit. This is called underfitting the model or model …

Continue reading

Posted in Data Science, Interview questions, Machine Learning.

Reinforcement Learning Real-world examples

In this post, you will learn about some real-world / real-life examples of reinforcement learning, one of the approaches to machine learning alongside supervised and unsupervised learning. Before looking into the real-world examples, let’s quickly understand what reinforcement learning is.

Introduction to Reinforcement Learning (RL)

Reinforcement learning is an approach to machine learning in which agents are trained to make a sequence of decisions. The agent, also called an AI agent, is trained in the following manner:

The agent interacts with the environment and makes decisions or choices.
For training purposes, the agent is provided with contextual information about the environment and …

Continue reading

Posted in Data Science, Machine Learning.