
Learn R – How to Convert Columns from Character to Factor

This article shows different ways to convert one or more columns of a data frame from character to factor in the R programming language.

The article covers the following:

  • Convert single column to factor
  • Convert multiple columns to factor

The following data frame, df, is used in the code samples below:

  param_a param_b param_c diagnosis param_d
1      23    0.61   10452  positive       y
2      18    0.85    9876  positive       n
3      22    0.32    6534  negative       y
4      37    0.56    8743  positive       y
5      15    0.44    9876  negative       n
6      25    0.13    4321  negative       n
7      55    0.51    7685  positive       y
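
To follow along, the data frame above can be recreated with a sketch like the one below. The numeric columns are assumed to be plain numeric vectors, and stringsAsFactors = FALSE is set explicitly because R versions before 4.0 convert character columns to factor by default.

```r
# Recreate the example data frame shown above.
# stringsAsFactors = FALSE keeps diagnosis and param_d as character
# vectors on R versions before 4.0, where the default was TRUE.
df <- data.frame(
  param_a   = c(23, 18, 22, 37, 15, 25, 55),
  param_b   = c(0.61, 0.85, 0.32, 0.56, 0.44, 0.13, 0.51),
  param_c   = c(10452, 9876, 6534, 8743, 9876, 4321, 7685),
  diagnosis = c("positive", "positive", "negative", "positive",
                "negative", "negative", "positive"),
  param_d   = c("y", "n", "y", "y", "n", "n", "y"),
  stringsAsFactors = FALSE
)

# Print the data frame to compare it with the listing above.
print(df)
```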

In the data frame above, both diagnosis and param_d are character vectors. To check the classes of all columns at once, run the following command:

sapply(df, class)
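
As a small illustration of what this check returns, here is a sketch on a hypothetical two-column data frame (toy_classes is not part of the article's data). str() is shown as an alternative that also previews the values.

```r
# Hypothetical data frame, used only to illustrate the class check.
toy_classes <- data.frame(
  param_a = c(23, 18, 22),
  param_d = c("y", "n", "y"),
  stringsAsFactors = FALSE
)

# Named character vector with one class per column.
col_classes <- sapply(toy_classes, class)
print(col_classes)

# str() gives a compact overview of each column's type and first values.
str(toy_classes)
```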

Convert Single Column to Factor

The code samples below show how to convert a single column, with a comment explaining each one. Note that lapply can also be used for a single column, but it raises a warning when the column is selected as a plain vector rather than as a one-column data frame.

# Call as.factor on the column selected with $ notation
df$param_d <- as.factor(df$param_d)

# Call as.factor on the column selected with bracket notation
df[, 'param_d'] <- as.factor(df[, 'param_d'])

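# Use lapply; either line below makes the param_d column a factor.
# The column is selected as a one-column data frame (no comma, or drop = FALSE)
# so that factor is applied to the whole column and not to each element.
df['param_d'] <- lapply(df['param_d'], factor)
df[, 'param_d'] <- lapply(df[, 'param_d', drop = FALSE], factor)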
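
The warning mentioned above comes down to how the column is selected. The sketch below, on a hypothetical three-row data frame (toy is not the article's df), compares the two selection styles and then converts the column through the form that keeps the data frame structure.

```r
# Hypothetical three-row data frame, used only for illustration.
toy <- data.frame(
  param_a = c(23, 18, 22),
  param_d = c("y", "n", "y"),
  stringsAsFactors = FALSE
)

# With a comma, a single column is dropped to a plain character vector,
# so lapply would visit each element separately.
print(class(toy[, "param_d"]))

# Without a comma, the result is still a data frame (a list with one column),
# so lapply visits the column as a whole.
print(class(toy["param_d"]))

# Convert the column through the one-column data frame.
toy["param_d"] <- lapply(toy["param_d"], factor)
print(levels(toy$param_d))
```

After a conversion, it is worth checking that the column still holds the original values, for example by comparing as.character(toy$param_d) with the original vector.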

Convert Multiple Columns to Factor

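Use lapply to convert several columns to factor in one step. Selecting more than one column returns a data frame, so lapply applies factor to each selected column as a whole.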

df[, c("param_d", "diagnosis")] <- lapply(df[, c("param_d", "diagnosis")], factor)
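
When the column names are not known in advance, one possible variation is to select every character column programmatically. This is a sketch on a hypothetical data frame (toy_multi and chr_cols are illustrative names, not from the article), and it assumes that all character columns should become factors.

```r
# Hypothetical data frame, used only for illustration.
toy_multi <- data.frame(
  param_a   = c(23, 18, 22),
  diagnosis = c("positive", "positive", "negative"),
  param_d   = c("y", "n", "y"),
  stringsAsFactors = FALSE
)

# Logical vector marking which columns are character vectors.
chr_cols <- vapply(toy_multi, is.character, logical(1))

# Convert only those columns; numeric columns are left untouched.
toy_multi[chr_cols] <- lapply(toy_multi[chr_cols], factor)

print(sapply(toy_multi, class))
```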
Ajitesh Kumar
