This article presents sample R code for extracting random training and test data sets from a data frame. The code below could prove handy when you are building a model with any machine learning algorithm. Please feel free to comment or make suggestions if I have missed any important points. Also, sorry for any typos.
    # Read the data from a file. This assumes that the working directory has
    # already been set; it can be set with the setwd() command.
    sample_df <- read.csv("glass.data", header=TRUE, stringsAsFactors=FALSE)

    # Get a vector of all row indices, from 1 to the number of rows
    index <- 1:nrow(sample_df)

    # Draw a random sample of n indices from the index vector; here n is one
    # third of the rows, computed as trunc(length(index) / 3)
    randindex <- sample(index, trunc(length(index) / 3))

    # The training set consists of all rows except those selected by randindex
    trainset <- sample_df[-randindex,]

    # The test set consists of the rows selected by randindex
    testset <- sample_df[randindex,]
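
The same steps can be wrapped in a small reusable function. The sketch below is only one possible way to do it and is not part of the original code: the function name `split_train_test` and its parameters `test_fraction` and `seed` are hypothetical, and the built-in `iris` data frame stands in for `glass.data` so that the example runs on its own. It uses `round()` rather than `trunc()` for the test-set size, and `setdiff()` rather than negative indexing so that an empty test set does not accidentally empty the training set as well.

```r
# Hypothetical helper (an illustration, not from the original article):
# split a data frame into random training and test sets.
split_train_test <- function(df, test_fraction = 1/3, seed = NULL) {
  # Fix the random seed only when the caller wants a reproducible split
  if (!is.null(seed)) set.seed(seed)

  # seq_len() is safe even when the data frame has zero rows
  index <- seq_len(nrow(df))

  # Number of rows to hold out for testing, rounded to a whole number
  n_test <- round(length(index) * test_fraction)

  # Random row indices for the test set
  randindex <- sample(index, n_test)

  # All remaining indices go to the training set; setdiff() also behaves
  # correctly when randindex is empty, unlike df[-randindex, ]
  trainindex <- setdiff(index, randindex)

  list(train = df[trainindex, , drop = FALSE],
       test  = df[randindex, , drop = FALSE])
}

# Usage with the built-in iris data frame (an assumption for illustration)
parts <- split_train_test(iris, test_fraction = 0.2, seed = 42)
nrow(parts$train)
nrow(parts$test)
```

Passing a `seed` makes the split repeatable, which helps when comparing models trained on the same training set. Leaving it as `NULL` gives a different random split on each call, as in the code above.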