This article presents different ways to convert one or more columns of a data frame to factor when working with the R programming language. Please feel free to comment or suggest if I missed one or more important points.
The following key points are described later in this article: converting a single column to factor, and converting multiple columns to factor.
The following data frame, df, is used in the code samples below:
  param_a param_b param_c diagnosis param_d
1      23    0.61   10452  positive       y
2      18    0.85    9876  positive       n
3      22    0.32    6534  negative       y
4      37    0.56    8743  positive       y
5      15    0.44    9876  negative       n
6      25    0.13    4321  negative       n
7      55    0.51    7685  positive       y
In the above data frame, both diagnosis and param_d are character vectors. One can quickly check the classes of all columns using the following command:
sapply(df, class)
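To follow along, the data frame can be recreated from the values shown above. This is only a sketch: stringsAsFactors = FALSE is set explicitly because R versions before 4.0.0 convert strings to factors by default, and the variable name classes is just an illustrative choice.

```r
# Rebuild the sample data frame; keep the text columns as character
df <- data.frame(
  param_a   = c(23, 18, 22, 37, 15, 25, 55),
  param_b   = c(0.61, 0.85, 0.32, 0.56, 0.44, 0.13, 0.51),
  param_c   = c(10452, 9876, 6534, 8743, 9876, 4321, 7685),
  diagnosis = c("positive", "positive", "negative", "positive",
                "negative", "negative", "positive"),
  param_d   = c("y", "n", "y", "y", "n", "n", "y"),
  stringsAsFactors = FALSE
)

# Named character vector with the class of every column
classes <- sapply(df, class)
print(classes)
```

With this setup, diagnosis and param_d are reported as "character" and the remaining columns as "numeric".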
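The code samples below convert a single column to factor; the comments explain each approach. Note that lapply can also be used on a single column, but only if the selection stays a data frame. Selecting the column as df[, 'param_d'] drops it to a plain vector, so lapply runs over its 7 individual elements; the assignment then throws the warning "provided 7 variables to replace 1 variables" and keeps only the first of them. Selecting it as df['param_d'] avoids the problem.
# Invoke as.factor on the column selected as dataframe$columnName
df$param_d <- as.factor(df$param_d)

# Invoke as.factor on the column selected with bracket notation
df[, 'param_d'] <- as.factor(df[, 'param_d'])

# Use lapply; both statements below make the param_d column a factor.
# df['param_d'] keeps a one-column data frame, so factor sees the whole column
df['param_d'] <- lapply(df['param_d'], factor)
df[c("param_d")] <- lapply(df[c("param_d")], factor)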
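Why the way a single column is selected matters for lapply can be seen by counting what lapply iterates over. The following is a minimal sketch on a one-column copy of the sample data; the names n_elements and n_columns are illustrative only.

```r
# One-column version of the sample data, kept as character
df <- data.frame(param_d = c("y", "n", "y", "y", "n", "n", "y"),
                 stringsAsFactors = FALSE)

# df[, "param_d"] drops to a character vector: lapply visits each element
n_elements <- length(lapply(df[, "param_d"], factor))

# df["param_d"] stays a one-column data frame: lapply visits the column once
n_columns <- length(lapply(df["param_d"], factor))

print(c(n_elements, n_columns))

# Convert using the selection that keeps the data frame structure
df["param_d"] <- lapply(df["param_d"], factor)

# Verify the result
print(is.factor(df$param_d))
print(levels(df$param_d))
```

The first count is one entry per row, the second is one entry per column, which is what the assignment back into the data frame expects.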
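Use lapply to convert several columns to factor in one statement. Selecting two or more columns keeps the selection a data frame, so lapply applies factor to each column as a whole: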
df[, c("param_d", "diagnosis")] <- lapply(df[, c("param_d", "diagnosis")], factor)
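When the columns to convert are not known by name in advance, the same lapply pattern can be driven by a logical mask. The sketch below is an assumption about a common use case rather than something from the article itself: it converts every character column, and char_cols is an illustrative name.

```r
# Sample data with two character columns and one numeric column
df <- data.frame(
  param_a   = c(23, 18, 22, 37, 15, 25, 55),
  diagnosis = c("positive", "positive", "negative", "positive",
                "negative", "negative", "positive"),
  param_d   = c("y", "n", "y", "y", "n", "n", "y"),
  stringsAsFactors = FALSE
)

# TRUE for every column that currently holds character data
char_cols <- sapply(df, is.character)

# Convert only those columns; numeric columns are left untouched
df[char_cols] <- lapply(df[char_cols], factor)

print(sapply(df, class))
print(levels(df$diagnosis))
```

Indexing with df[char_cols] keeps the selection a data frame even if only one column matches, so the single-column pitfall does not arise here.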