The commands below operate on a data frame, messages_text, which holds text data loaded with the read.table command, for example:
messages_text <- read.table( file.choose(), sep="\t", stringsAsFactors=FALSE)
The commands below accomplish the following:
# Display the structure of the data frame (column names, types, and sample values)
# after loading it with a command such as read.csv or read.table.
str(messages_text)
# Rename the columns. If the text file has no header row and starts directly with
# the data, the columns are named V1, V2, etc. by default,
# so it is a good idea to give the features meaningful names.
names(messages_text) <- c( "type", "text")
# as.factor is frequently used to convert a categorical feature to a factor. Because the data
# was loaded with stringsAsFactors=FALSE, this column is initially a character vector.
messages_text$type <- as.factor(messages_text$type)
# table, when applied to a factor, gives the number of occurrences of
# each category.
table(messages_text$type)
# prop.table, when applied to the table of a factor, gives the proportion of occurrences of
# each category; multiplying by 100 converts the proportions to percentages.
prop.table(table(messages_text$type))*100
# round applied to the prop.table result gives the percentage of each category,
# rounded to the number of digits specified.
round(prop.table(table(messages_text$type))*100, digits=2)
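To try these steps without choosing a file, the following is a minimal sketch that runs the same commands on a small in-memory sample. The four sample rows, the "ham"/"spam" labels, and the variable names raw, type_counts and type_percent are assumptions made for illustration only; they are not from the original data set. The quote="" argument is also an addition, used here so that apostrophes or quotation marks inside message text are not treated as quoting characters.

```r
# Hypothetical sample data standing in for the tab-separated text file.
raw <- "ham\tSee you at lunch\nspam\tWin a free prize now\nham\tRunning late\nham\tCall me later"

# read.table can parse a string through its text argument instead of a file.
messages_text <- read.table(text = raw, sep = "\t",
                            stringsAsFactors = FALSE, quote = "")

# Inspect the structure, then replace the default names V1 and V2.
str(messages_text)
names(messages_text) <- c("type", "text")

# Convert the label column from character to factor.
messages_text$type <- as.factor(messages_text$type)

# Count each category, then express the counts as rounded percentages.
type_counts <- table(messages_text$type)
type_percent <- round(prop.table(type_counts) * 100, digits = 2)

print(type_counts)
print(type_percent)
```

With this sample, the counts table reports 3 for ham and 1 for spam, and the percentage table reports 75 and 25.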