Tag Archives: datascience

Most Common Machine Learning Tasks

This article presents some of the most common machine learning tasks that one may come across while trying to solve machine learning problems. Also listed is a set of machine learning methods that could be used to address these tasks. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. You might want to check out the post on what machine learning is. Different aspects of machine learning concepts are explained there with the help of examples. Here is an excerpt from the page: Machine learning is about approximating mathematical functions (equations) representing real-world scenarios. These mathematical functions are also referred …

Posted in AI, Big Data, Data Science, Machine Learning.

Pandas – How to Concatenate Dataframe Columns

A quick code sample on how to concatenate data frames by columns. We will work with the Boston dataset found in sklearn.datasets. Note that data frames can be concatenated by rows or by columns. In this post, you will learn how to concatenate data frames by columns. Here is the code for working with the Boston dataset. First and foremost, the Boston dataset is loaded. Once loaded, let's create two different data frames, one comprising the data and the other the target variable. The above creates two data frames comprising the data (features) and the values of the target variable. Here are the snapshots. Use the following command to concatenate the data frames. …
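The post's own code is not carried over into this excerpt. As a minimal sketch of the column-wise concatenation it describes, the following uses `pandas.concat` with `axis=1`. The two small hand-built frames, their column names, and their values are assumptions standing in for the Boston features and target (`load_boston` was removed from scikit-learn in version 1.2), so this is an illustration rather than the post's original code.

```python
import pandas as pd

# Illustrative stand-in for the feature data frame (values are made up for the sketch).
features = pd.DataFrame({
    "CRIM": [0.01, 0.03, 0.02],
    "RM": [6.5, 6.4, 7.1],
    "AGE": [65.0, 79.0, 61.0],
})

# Illustrative stand-in for the target data frame.
target = pd.DataFrame({"MEDV": [24.0, 21.5, 34.5]})

# axis=1 places the frames side by side, aligning rows on the index.
combined = pd.concat([features, target], axis=1)

print(combined.shape)
print(list(combined.columns))
```

Because `concat` aligns on the index, both frames should share the same row index; if they do not, calling `reset_index(drop=True)` on each frame first is one way to avoid rows padded with `NaN`.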

Posted in AI, Data Science, Machine Learning.

Top 7 Should-Have Skills of A Data Scientist

With all the hype around data science as one of the most lucrative career options in recent times, it is only natural that we may be tempted to explore whether we have what it takes to become a successful data scientist. As a matter of fact, I have very frequently come across the question of what it takes to become a data scientist. This question has been addressed numerous times in many articles. However, I wanted to present a fresh perspective based on the gruelling and rigorous journey in data science that I went through over the last year or so. Based out …

Posted in Big Data.

Machine Learning – How to Debug Learning Algorithm for Regression Model

This article presents some of the key reasons for large prediction error when working with regression models, and what one could do to reduce that error. The techniques mentioned below could be used for both linear and logistic regression models. As a matter of fact, the same arguments could also be used to debug an artificial neural network; in place of features, what is considered is the number of hidden layers and units. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. Following are the key points described later in this article: Key Reasons for Larger Prediction Error; Key Techniques to …

Posted in Big Data.

Machine Learning – How to Diagnose Underfitting/Overfitting of Learning Algorithm

This article presents a technique that could be used to identify whether a learning algorithm is suffering from a high bias (under-fitting) or high variance (over-fitting) problem. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. Following are the key problems related to learning algorithms that are described later in this article: Under-fitting Problem; Over-fitting Problem. Diagnose Under-fitting & Over-fitting Problem of Learning Algorithm: The challenge is to identify whether the learning algorithm has one of the following problems. High bias or under-fitting: At times, our model is represented using a polynomial equation of relatively low degree, although a higher degree of …
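One common way to carry out the diagnosis described above is to compare training error with validation error as the polynomial degree grows. The sketch below is an illustration under stated assumptions, not the article's own method: the synthetic noisy cubic data, the every-third-point validation split, and the helper name `mse` are all hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): a noisy cubic curve.
x = np.linspace(-3, 3, 60)
y = x**3 - 2 * x + rng.normal(scale=2.0, size=x.size)

# Hold out every third point as a validation set.
val_mask = np.arange(x.size) % 3 == 0
x_train, y_train = x[~val_mask], y[~val_mask]
x_val, y_val = x[val_mask], y[val_mask]


def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on (xs, ys)."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))


# Fit polynomials of increasing degree and record (train, validation) error.
errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_val, y_val))
    print(f"degree={degree}: train MSE={errors[degree][0]:.2f}, "
          f"validation MSE={errors[degree][1]:.2f}")
```

A model whose training and validation errors are both high (here, the degree-1 fit) points to high bias. A model whose training error is low while its validation error is noticeably higher points to high variance.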

Posted in Big Data.

Machine Learning – 7 Steps to Train a Neural Network

This article presents some of the key steps required to train a neural network. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. Key Steps for Training a Neural Network: Following are 7 key steps for training a neural network. Pick a neural network architecture. This means deciding primarily on the connectivity pattern of the neural network, including some of the following aspects. Number of input nodes: The number of input nodes is given by the number of features. Number of hidden layers: The default is to use a single hidden …

Posted in Big Data.

Big Data – Top Education Resources from MIT

This article presents information on the Big Data initiative from MIT (Massachusetts Institute of Technology), including bookmarks to lecture notes for machine learning courses and MIT's machine learning video channel on YouTube. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. Following are the key points described later in this article: MIT CSAIL Big Data Initiative; Machine Learning Lecture Notes & Videos. MIT CSAIL Big Data Initiative: MIT has a website dedicated to the Big Data initiative from MIT CSAIL (Computer Science and Artificial Intelligence Laboratory). The following pages are worth a visit to understand ongoing research and listen to/view talks …

Posted in Big Data.

Weekly Roundup – Machine Learning & Statistics Bookmarks – 02 Feb 2015

This article presents links to some cool pages on machine learning & statistics that I thought were worth sharing. Please feel free to comment/suggest any other webpages that you have found to be good. Sorry for the typos. Machine Learning & Statistics Bookmarks. Andrew Ng: Anyone starting to learn machine learning is sure to come across a course, paper, or web page related to Andrew Ng, an Associate Professor at Stanford, Chief Scientist of Baidu, and Chairman and Co-Founder of Coursera. Some of the pages citing his work are the following: Courses, Publications, Research. Andrew W. Moore: A great set of tutorials by Andrew W. Moore, who is Dean of the School of Computer …

Posted in Big Data.

Machine Learning – 9 Most Common Usecases for Higher Business Growth

This article presents some of the most common use cases of machine learning algorithms that have been found to impact business growth (in terms of revenue) in a positive manner. These use cases are most commonly seen in businesses that run some form of ecommerce site to support one or more aspects of their business. I have tried to provide information on which algorithm (or class of algorithms) could be used to come up with a solution for each of these use cases. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. Following are different areas, at …

Posted in Big Data.

Top 4 Machine Learning Usecases for Energy Forecasting

This article presents the top 4 machine learning use cases for energy forecasting. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. Machine Learning Usecases for Energy Forecasting: Following are different use cases in energy management where machine learning could be used for probabilistic energy forecasting. For those who are new to probabilistic forecasting, here is the definition from Wikipedia: Probabilistic forecasting summarises what is known, or opinions about, future events. In contrast to single-valued forecasts (such as forecasting that the maximum temperature at a given site on a given day will be 23 degrees Celsius or that the result …

Posted in Big Data.

Machine Learning Usecases for Pinterest.com & related Kosei Acquisition

This article presents thoughts on the recent acquisition of Kosei, a commerce recommendation system, by Pinterest.com. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. Following are the key points described later in this article: How could Machine Learning help Pinterest fuel its overall growth? How could Kosei help Pinterest.com? How could Machine Learning help Pinterest fuel its overall growth? In yet another acquisition in the machine learning space, Pinterest.com acquires Kosei to achieve some of the following objectives: Better ad targeting for greater monetization from ad clicks. This looks to be a case of identifying user clusters based …

Posted in Big Data.

Data Science – List of Common Machine Learning Problems with Examples

This article presents quick examples for 5 different classes of machine learning problems/tasks. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. Following is a set of 5 key machine learning problems/tasks whose examples are listed later in this article: Regression, Classification, Clustering, Association Rules, Artificial Neural Networks. Examples – Regression Models: Real Estate – Housing price estimation; Financial – Stock price estimation; Insurance – Estimate medical care expenses; Sales & Marketing – Sales vs ad spend; Company growth estimation. Examples – Classification Models: Following are four different algorithms whose examples are listed below: Naive …

Posted in Big Data.

Cheat Sheet – 10 Machine Learning Algorithms & R Commands

This article lists 10 popular machine learning algorithms and the related R commands (and package information) that could be used to create the respective models. The objective is to provide a quick reference page for beginner/intermediate-level R programmers who are working on machine learning problems. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. Following are the different ML algorithms included in this article: Linear Regression, Logistic Regression, K-Means Clustering, K-Nearest Neighbors (KNN) Classification, Naive Bayes Classification, Decision Trees, Support Vector Machine (SVM), Artificial Neural Network (ANN), Apriori, AdaBoost. Cheat Sheet – ML Algorithms & R Commands. Linear regression: …

Posted in Big Data.

API – How to Get Started with Facebook API Integration

This article presents the steps to get started with the Facebook Graph API. In later articles, I shall explain how to integrate using Java and maybe other programming languages. The primary reason I am hooked on Facebook integration these days is my need to get exploratory data from Facebook for data analysis in my Big Data projects. Before moving on to a framework such as RestFB, it is recommended to play with these APIs in the Facebook-provided playground. I shall be talking in detail about how to get started with RestFB in later articles. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the …

Posted in Software Quality.

Data Science – Top 5 Videos to Get Started with Neural Networks

This article presents some good YouTube videos that I found useful for understanding how the brain works and what neural networks are. Note that I needed to do this because I wanted to get started with machine learning and neural network algorithms. To do that effectively, I needed to understand what neural networks are, and the videos below helped me get started within an hour. Please feel free to suggest other great videos that I may have missed. Sorry for the typos. From Neurons to Networks: I would rate it as one of the best videos I have seen on how the human brain works. MUST watch!!! …

Posted in Big Data.

Data Science – 3 Key Aspects of Applying KMeans Algorithm for Clustering Tasks

This article presents key concepts around the KMeans algorithm, including key aspects and the formula/R command to use when you are working on clustering tasks. Please feel free to comment/suggest if I missed one or more important points. Also, sorry for the typos. Following are the key points described later in this article: Key aspects of applying KMeans algorithm; KMeans Algorithm – R Command. Key aspects of applying KMeans Algorithm: The key aspects of applying the KMeans algorithm are the following. Selecting the right combination of features: In the data set, you may observe some of the following: one or more features hold non-numeric or character data. As …
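To show why feature selection matters here, the following is a minimal sketch of the standard k-means procedure (Lloyd's algorithm) in Python with NumPy, offered in place of the R command the article goes on to cover. Every step works on Euclidean distances and means, so the features must be numeric. The function name `kmeans`, its `init_centroids` parameter, and the toy points are assumptions made for the illustration.

```python
import numpy as np


def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(max_iter):
        # Distance from every point to every centroid, shape (n_points, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Toy data (hypothetical): two well-separated groups of numeric points.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])

# Start from one point of each group; real use would pick starting centroids at random.
labels, centroids = kmeans(points, init_centroids=points[[0, 3]])

print(labels)
print(centroids)
```

A character-valued feature cannot be passed through the distance computation as is; it would have to be dropped or encoded numerically before clustering.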

Posted in Big Data.