Learn R – How to Get Random Training and Test Data Set

This article presents sample source code that extracts random training and test data sets from a data frame using the R programming language. The R code below comes in handy when you are building a model with any machine learning algorithm. Please feel free to comment or suggest if I have missed one or more important points. Also, sorry for any typos.

# Read the data from a file. The command below assumes that the working
# directory has already been set. You can set the working directory with
# the setwd() command.
sample_df <- read.csv("glass.data", header=TRUE, stringsAsFactors=FALSE)

# Get a vector of all row indices, from 1 to the number of rows
index <- seq_len(nrow(sample_df))

# Get n random indices from the index vector, sampled without replacement.
# Here n is one third of the rows, rounded down: trunc(length(index) / 3)
randindex <- sample(index, trunc(length(index) / 3))

# Get the training set: all rows except those selected by randindex
trainset <- sample_df[-randindex,]
# Get the test set: the rows selected by randindex
testset <- sample_df[randindex,]
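
The same steps can be tried without any data file. The sketch below is an illustration, not part of the original code: it assumes the built-in iris data frame as a stand-in for glass.data, and it adds a set.seed() call (the seed value 42 is arbitrary) so that the same random split is drawn on every run. The variable names df, n and test_idx are hypothetical.

```r
# Stand-in data: iris ships with R and has 150 rows (assumption: any data
# frame with at least 3 rows would work the same way)
df <- iris

# Fix the random seed so the split is reproducible; 42 is an arbitrary choice
set.seed(42)

# One third of the rows, rounded down, go to the test set
n <- nrow(df)
test_idx <- sample(seq_len(n), trunc(n / 3))

# Negative indices drop the test rows, leaving the training rows
trainset <- df[-test_idx, ]
testset <- df[test_idx, ]

# Sanity checks: the two sets cover every row and share none
stopifnot(nrow(trainset) + nrow(testset) == n)
stopifnot(length(intersect(rownames(trainset), rownames(testset))) == 0)

nrow(trainset)  # 100
nrow(testset)   # 50
```

Without the set.seed() call, each run would draw a different random split, which is fine for a one-off experiment but makes results harder to compare between runs.
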
Ajitesh Kumar
