Tag Archives: machine learning

Data Science – Who could become a Data Scientist?

This article covers the different classes of IT and non-IT professionals who could take the free data science courses mentioned here and get on the path to becoming a data scientist. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. The following classes of IT/non-IT professionals are addressed later in this article: Software Development Stakeholders Working on Non-analytics Projects; Data Warehouse/BI Developers; Big Data Developers; Statisticians; Senior Management Executives; Non-Software Professionals. Could I become a Data Scientist? Anyone matching the following criteria could become a data scientist. One is decent with Mathematics & Statistics …

Posted in Big Data.

Top 10 Solution Approaches for Supervised Learning Problems

This article presents the top 10 solution approaches that could be used to solve supervised learning problems. For those unaware of what a supervised learning problem is, here is the definition of supervised learning from Wikipedia: "Supervised learning is the machine learning task of inferring a function from labeled training data.[1] The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples." Following are the two different kinds of supervised …
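To make the quoted definition concrete, here is a minimal sketch of inferring a function from labeled training data. It uses ordinary least squares on a straight line purely as an illustration; the training pairs and the names `training_examples`, `fit_line`, and `predict` are made up for this example and do not come from the article.

```python
# Labeled training data: (input, desired output) pairs. Values are invented.
training_examples = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def fit_line(examples):
    """Infer f(x) = slope * x + intercept by ordinary least squares."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    # Covariance-like and variance-like sums for the closed-form solution.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    sxx = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The returned function is the "inferred function" of the definition.
    return lambda x: slope * x + intercept


predict = fit_line(training_examples)
# Map a new, unseen example through the inferred function.
print(predict(5.0))
```

The inferred function is then applied to an input that was not in the training set, which is the "mapping new examples" step in the definition.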

Posted in Big Data.

Learn R or Python for Becoming Data Scientist?

This article analyzes whether one should learn the R or the Python programming language to create one or more predictive models using different machine learning algorithms. Both languages are doing equally well and are sought after by developers and by the companies hiring such developers, so you could choose either one. However, the majority has been found to favour Python for its ease of learning and greater community support. Data Scientist with Expertise in R: The following indeed.com plot represents the job trends for the search term “Data Scientist R”. It clearly indicates the trend such as …

Posted in Big Data.

Machine Learning – Top 16 Learning Resources on Statistics

This article lists some of the top learning resources (webpages, videos, etc.) on my frequently visited list. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. Following are the key categories of webpages/videos that are expanded later in this article: Websites, Quora, YouTube Videos, Coursera Courses, Khan Academy. Top 16 Learning Resources on Statistics: Following is the list of URLs for these learning resources. Websites on Statistics: Stattrek.com, Elementary Statistics with R, StatsDirect.com, Usable Stats. Quora.com: Statistics Channel, Probability & Statistics, Statistics (Academic Discipline), Bayesian Inference. YouTube Videos (Playlists on Statistics): Brandon Foltz, StatisticsFun, JBStatistics, Quantitative Specialists. Coursera Courses …

Posted in Big Data.

Machine Learning Research in Top 10 US Universities

This article presents information on machine learning departments and related research projects in the top 10 US universities (as per the USNews ranking). I have put it together for my quick reference and thought I would share it with you for the same purpose. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. Following are the top 10 universities covered later in this article: Princeton University, Harvard University, Yale University, Columbia University, Stanford University, University of Chicago, MIT, Duke University, University of Pennsylvania, California Institute of Technology. Machine Learning @ Top 10 US Universities: Princeton University: Machine Learning Department at Princeton University …

Posted in Big Data.

Machine Learning – Top 5 Video Channels for Regression Models

This article lists the top 5 video channels that one could use to learn and become an expert at regression models. I watch these videos once in a while to clear up my doubts about regression models. As I find these pages very useful, I thought I would share them with you all. These are some really good videos from a learning perspective that could help you get started with regression models and get the hang of them in no time. Please feel free to share this with your community. Please feel free to comment or suggest if I missed any other great video channels. Also, sorry for the …

Posted in Big Data.

Data Science – Top 10 Websites to Bookmark for Daily News

This article presents links and information on the top 10 websites that publish data science news and articles on a daily/regular basis. These links are my favorites and help me remain up to date with the latest and greatest happenings in the field of data science. Please feel free to comment or suggest if I missed one or more important and interesting websites in the list given below. Also, sorry for the typos. Following are the key points described later in this article: Top 5 Data Science News Websites – Recommended Daily Visit; Top 5 Data Science News Websites – Recommended Regular Visit. Top 5 Data Science News Websites – Recommended …

Posted in Big Data.

Machine Learning – Mathematical Concepts for Linear Regression Models

This article presents some of the key mathematics and statistics concepts that one may need to learn in order to work with linear regression models. Understanding the following concepts would help with evaluating linear regression models in these ways: interpreting coefficients; evaluating the regression model; comparing multiple regression models and choosing the best of them. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. Following are the key mathematical concepts/topics described later in this article: statistical hypothesis testing, probability distributions, quantitative data analysis, plots. Key Mathematics & Statistics Topics for Linear Regression Models …

Posted in Big Data.

Learn R – How to Get Random Training and Test Data Set

This article presents sample source code that could be used to extract random training and test data sets from a data frame using the R programming language. The R code below could prove very handy while you are working to create a model using any machine learning algorithm. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos.

# Read the data from a file. The command below assumes that the working
# directory has already been set. One could set the working directory using
# the setwd() command.
sample_df <- read.csv("glass.data", header=TRUE, stringsAsFactors=FALSE)
# get a vector comprising of all indices …

Posted in Big Data.

Machine Learning – Bookmarks for Great Tutorials, Books & Videos

This article presents quick bookmarks to some good machine learning web pages, including tutorial documents and videos. Please feel free to comment or suggest if you know of further good bookmarks. I shall be adding more bookmarks in time to come. Also, sorry for the typos. Following are the key bookmarks: List of Tutorial Pages on Different Machine Learning Topics: You will surely want to bookmark this page, as it consists of some really cool links covering different topics in machine learning. List of Machine Learning Books: Those looking for machine learning books to get started would want to bookmark this page, which consists of a list of some great books recommended …

Posted in Big Data.

Machine Learning – When to Use Logistic Regression vs. SVM

This article presents guidelines for determining whether to use logistic regression or SVM with kernels when working on a classification problem. I gathered these guidelines from one of Andrew Ng's videos on SVM from his machine learning course on Coursera.org. As I wanted a place to reach quickly in the future when I am working on a classification problem and want to check which algorithm to use, logistic regression or SVM, I decided to blog it here. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. Key Criteria for Using Logistic Regression vs …

Posted in Big Data.

Machine Learning – When to Use Linear vs Gaussian Kernel with SVM

This article presents guidelines that could be used to decide whether to use a linear kernel or a Gaussian kernel when working with a Support Vector Machine (SVM). Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. Following are the key points described later in this article: When to Use Linear Kernel; When to Use Gaussian Kernel. When to Use Linear Kernel: When there is a large number of features and a comparatively smaller number of training examples, one would want to use a linear kernel. As a matter of fact, this can also be called SVM with no kernel. One may …
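As a rough illustration of the two kernels named in the excerpt above, the sketch below computes a linear kernel (a plain dot product, which is why it is sometimes described as using no kernel) and a Gaussian kernel for a pair of feature vectors. The function names and the `sigma` bandwidth parameter are assumptions made for this example and are not taken from the article.

```python
import math


def linear_kernel(x, z):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel: exp(-||x - z||^2 / (2 * sigma^2)).

    sigma is a hypothetical bandwidth parameter chosen for illustration.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


x = [1.0, 2.0]
z = [3.0, 4.0]
print(linear_kernel(x, z))    # similarity as a raw dot product
print(gaussian_kernel(x, z))  # similarity that decays with distance
print(gaussian_kernel(x, x))  # identical points have similarity 1.0
```

The Gaussian kernel depends only on the distance between the two points, while the linear kernel grows with the magnitude of the features themselves.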

Posted in Big Data.

Top 7 Should-Have Skills of A Data Scientist

With all the hype around data scientist as one of the most lucrative career options in recent times, it is but natural that we may get tempted to explore whether we have in ourselves what it takes to become a successful data scientist. As a matter of fact, I have very frequently come across the question of what it would take to become a data scientist. Well, this question has been addressed numerous times in many articles. However, I wanted to present a fresh perspective based on the grilling and rigorous journey of data science that I went through in the last year or so. Based out …

Posted in Big Data.

8 Key Steps to Follow When Solving A Machine Learning Problem

This article presents some of the key steps one could take to create the most effective model for a given machine learning problem, using different machine learning algorithms. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. 8 Key Steps for Solving A Machine Learning Problem: Gather the data set: This is one of the most important steps, where the objective is to gather as large a volume of data as possible. Given that features have been selected appropriately, a large data set helps to minimize the training data set error and also, enable cross-validation and training data set error …

Posted in Big Data.

Machine Learning – How to Debug Learning Algorithm for Regression Model

This article presents some of the key reasons for larger prediction error when working with regression models, and what one could do to reduce that error. The techniques mentioned below could be used for both linear and logistic regression models. As a matter of fact, the arguments below could also be used to debug an artificial neural network; in place of features, what is considered there is the number of hidden layers and units. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. Following are the key points described later in this article: Key Reasons for Larger Prediction Error; Key Techniques to …

Posted in Big Data.

Machine Learning – How to Diagnose Underfitting/Overfitting of Learning Algorithm

This article presents a technique that could be used to identify whether a learning algorithm is suffering from a high bias (under-fitting) or a high variance (over-fitting) problem. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. Following are the key problems related to learning algorithms that are described later in this article: Under-fitting Problem; Over-fitting Problem. Diagnose Under-fitting & Over-fitting Problem of Learning Algorithm: The challenge is to identify whether the learning algorithm has one of the following: High bias or under-fitting: At times, our model is represented using a polynomial equation of relatively lower degree, although a higher degree of …
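The excerpt above is cut off before the technique itself is described, so the following is only a sketch of one commonly used heuristic, not necessarily the author's method: compare the error on the training set with the error on a held-out cross-validation set. The function name `diagnose_fit` and the thresholds `target_error` and `gap_ratio` are hypothetical and would need tuning for a real problem.

```python
def diagnose_fit(train_error, cv_error, target_error, gap_ratio=2.0):
    """Label a model from its training and cross-validation errors.

    target_error and gap_ratio are hypothetical thresholds for this sketch.
    """
    if train_error > target_error:
        # The model cannot even fit the data it was trained on.
        return "high bias (under-fitting)"
    if cv_error > target_error and cv_error > gap_ratio * train_error:
        # The model fits the training data but does not generalize.
        return "high variance (over-fitting)"
    return "no obvious bias or variance problem"


# Invented error values, for illustration only.
print(diagnose_fit(train_error=0.30, cv_error=0.32, target_error=0.10))
print(diagnose_fit(train_error=0.02, cv_error=0.25, target_error=0.10))
print(diagnose_fit(train_error=0.05, cv_error=0.07, target_error=0.10))
```

In this sketch, a high training error points to under-fitting, while a low training error paired with a much higher cross-validation error points to over-fitting.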

Posted in Big Data.