
Learn R – How to Fix read.table Reading Fewer Rows

This article describes a problem where read.table reads fewer rows than a multi-column text file actually contains, and how to fix it. It is a short post, but the problem cost me some time, so I decided to write it up. Please feel free to comment or suggest any important points I missed.
Problem Statement: Reading Fewer Lines with read.table Command

I have been learning naive Bayes classification. I downloaded this SMS collection data and tried to load it using the following command. It loaded around 1630 rows, although the file contains 5574 rows.

messages <- read.table( file.choose(), sep="\t", stringsAsFactors=FALSE)

I checked with dim(messages), which reported 1630 rows and 2 columns. That is fewer rows than the file actually contains, so the result is incorrect.
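One way to confirm that the file itself holds the expected number of records, independently of read.table, is to count its physical lines and the tab-separated fields on each line. The following is a minimal sketch that uses a small made-up file in place of the SMS data; the file contents and the variable names (path, raw_lines, n_lines, fields_per_line) are assumptions for illustration only.

```r
# Stand-in for the downloaded file; the contents are invented for illustration
# and are not taken from the SMS dataset.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "ham\tI'm going home",
  "spam\tWin a prize now",
  "ham\tDon't be late",
  "ham\tSee you soon"
), path)

# Read the raw lines without interpreting separators or quote characters.
raw_lines <- readLines(path)

# Number of physical lines, i.e. the row count we expect read.table to return.
n_lines <- length(raw_lines)

# Number of tab-separated fields on each physical line.
fields_per_line <- lengths(strsplit(raw_lines, "\t", fixed = TRUE))

print(n_lines)
print(table(fields_per_line))
```

If n_lines is larger than the row count reported by dim(messages) while every line has the same number of fields, the file is likely intact and the rows are being merged during parsing.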

Solution: Reading the Exact Number of Rows

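After investigation, I found that the messages contained single and double quotes. By default, read.table treats both characters as quote delimiters (quote = "\"'"), so an apostrophe in one message can open a quoted string that runs across line breaks until the next quote character, merging several lines into one row. Quote handling therefore needs to be disabled for read.table to read the correct number of rows. I did this with the following command and it worked well. Note the quote='' parameter.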

messages <- read.table( file.choose(), sep="\t", stringsAsFactors=FALSE, quote='')
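The effect of the quote argument can be tried on a small file. The sketch below is not the original SMS data: the four sample lines and the variable names (path, with_quotes, no_quotes) are assumptions made for illustration. The default call is wrapped in tryCatch because its result depends on where the apostrophes fall; it may return fewer rows or fail outright.

```r
# Tiny tab-separated file with apostrophes in the text column.
# The contents are invented for illustration.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "ham\tI'm going home",
  "spam\tWin a prize now",
  "ham\tDon't be late",
  "ham\tSee you soon"
), path)

# Physical line count: the number of rows we expect.
n_lines <- length(readLines(path))

# Default quote handling: apostrophes are treated as quote delimiters.
# Errors are caught and warnings silenced so the script always continues.
with_quotes <- tryCatch(
  suppressWarnings(read.table(path, sep = "\t", stringsAsFactors = FALSE)),
  error = function(e) NULL
)

# quote = "" disables quote handling, so each physical line becomes one row.
no_quotes <- read.table(path, sep = "\t", stringsAsFactors = FALSE, quote = "")

cat("lines in file:", n_lines, "\n")
cat("rows with default quote:",
    if (is.null(with_quotes)) NA else nrow(with_quotes), "\n")
cat("rows with quote = \"\":", nrow(no_quotes), "\n")
```

With quote = "", the row count matches the line count and the apostrophes are kept as ordinary characters in the text column.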

 

Ajitesh Kumar

