Category Archives: Data Science

Learn R – 5 Techniques to Create Empty Data Frames with Column Names

This article presents techniques for creating an empty data frame with column names. Please feel free to comment/suggest if I missed mentioning one or more important points. Also, sorry for the typos.

5 Techniques to Create Empty Data Frames

In each of the examples below, the data frame is created with three columns, namely, 'name', 'rating', and 'relyear'. It represents movie names, ratings, and release years.

# Command data.frame is used
df1 <- data.frame(name="", rating="", relyear="", stringsAsFactors=FALSE)
# Command data.frame is used
df2 <- data.frame(name=character(), rating=character(), relyear=character(), stringsAsFactors=FALSE)
# Usage of read.table command to create empty data frame
df3 <- read.table(text = "", colClasses = c("character", …
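As a minimal sketch, the first two techniques from the excerpt can be run as follows. The column names come from the excerpt; the `nrow()` and `names()` checks are added here only for illustration. One thing worth noticing is that passing empty strings produces a data frame with one row of empty values, while passing zero-length vectors produces a data frame with no rows.

```r
# Empty strings as column values: the result has ONE row of empty values
df1 <- data.frame(name = "", rating = "", relyear = "",
                  stringsAsFactors = FALSE)

# Zero-length character vectors: the result has ZERO rows
df2 <- data.frame(name = character(), rating = character(),
                  relyear = character(), stringsAsFactors = FALSE)

# Illustrative checks (not part of the original article)
nrow(df1)    # 1
nrow(df2)    # 0
names(df2)   # "name" "rating" "relyear"
```

If a data frame with no rows at all is needed, the zero-length vector form is probably the safer choice of the two.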

Continue reading

Posted in Data Science.

Data Science – Hypothesis Testing Explained with Examples

[Image: Hypothesis Testing Workflow]

This article presents some of the key statistical concepts, along with examples, relating to how to formulate a hypothesis for hypothesis testing. Knowledge of hypothesis formulation and hypothesis testing is key to building various machine learning models. In later articles, hypothesis formulation for machine learning algorithms such as linear regression and logistic regression will be explained. Please feel free to comment/suggest if I missed mentioning one or more important points. Also, sorry for the typos. Following are the key points described later in this article: What is a hypothesis? How to formulate a hypothesis as a Null or Alternate Hypothesis? What is hypothesis testing? What is a …

Continue reading

Posted in AI, Data Science, Machine Learning.

12 Most Common Machine Learning Tasks

[Image: Common machine learning tasks]

This article presents some of the most common machine learning tasks that one may come across while trying to solve a machine learning problem. Under each task is also listed a set of machine learning methods that could be used to accomplish it. Please feel free to comment/suggest if I missed mentioning one or more important points. Also, sorry for the typos. Following are the key machine learning tasks outlined later in this article: Data gathering, Data preprocessing, Exploratory data analysis (EDA), Feature engineering, Training machine learning models of the following kinds: Regression, Classification, Clustering, Multivariate querying, Density estimation, Dimensionality reduction, Model / Algorithm selection, Testing and matching, Model monitoring …

Continue reading

Posted in AI, Big Data, Data Science, Machine Learning.

Data Science – How to Scale or Normalize Numeric Data using R

This article presents concepts around the need to normalize or scale numeric data, along with code samples in the R programming language that could be used to do so. Please feel free to comment/suggest if I missed mentioning one or more important points. Also, sorry for the typos. Following are the topics described later in this article, including two different ways to normalize the data: Why Normalize or Scale the data?, Min-Max Normalization, Z-Score Standardization. Why Normalize or Scale the data? A data frame can contain features on very different scales: values for one feature could range between 1 and 100, and values for another feature could …
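The two approaches named in the excerpt can be sketched in a few lines of base R. This is an illustrative sketch rather than the article's own code: the helper names `min_max` and `z_score` and the sample vector `x` are assumptions made up for this example.

```r
# Hypothetical sample values (not taken from the article)
x <- c(1, 25, 50, 75, 100)

# Min-max normalization: rescales values to the range [0, 1]
min_max <- function(v) (v - min(v)) / (max(v) - min(v))

# Z-score standardization: shifts to mean 0 and scales to standard deviation 1
# (sd() computes the sample standard deviation, dividing by n - 1)
z_score <- function(v) (v - mean(v)) / sd(v)

x_mm <- min_max(x)
x_z  <- z_score(x)

range(x_mm)   # 0 1
```

Base R's `scale()` function performs the same centering and scaling as `z_score` by default and works column-wise on matrices and data frames, so it may be more convenient when several numeric columns need standardizing at once.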

Continue reading

Posted in AI, Big Data, Data Science.