
Data Science – How to Load Data included with R

This article describes different ways to load the datasets that ship with R packages. An important part of getting started with Data Science is to work with data as much as possible during the learning phase. Key activities at that stage include data loading, data extraction, and data wrangling/munging. The datasets bundled with R packages are a convenient source of practice data, which is why I decided to write this quick article. Please feel free to comment or suggest additions if I have missed any important points.

Following are two different ways in which one could load data included with R:

  • Load data after loading package
  • Load data without loading the package

The instructions below assume that you have installed the package before loading the datasets it provides.
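As a quick way to confirm that assumption, the following sketch checks whether a package is installed without loading it. The variable names pkg and is_installed are illustrative, and ggplot2 is used only because it is the example package in this article.

```r
# Name of the package whose datasets we want to use (illustrative choice)
pkg <- "ggplot2"

# system.file() returns "" when the package cannot be found,
# so a non-empty result means the package is installed.
is_installed <- nzchar(system.file(package = pkg))

if (is_installed) {
  message("Package '", pkg, "' is installed.")
} else {
  message("Package '", pkg, "' is not installed; run install.packages(\"", pkg, "\") first.")
}
```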

 

Load Data after Loading Package

The following simple commands load data that ships with an R package. I shall use the ggplot2 package as an example.

  • Load the package using the command "require(packageName)". For example, require(ggplot2)
  • Load the data using the command "data(dataSetName)". For example, data(diamonds)

Once the data is loaded, you can quickly inspect it using the command "head(dataSetName)". For example, head(diamonds)
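Put together, the steps above might look like the sketch below. It assumes ggplot2 is installed; the has_ggplot2 check is an addition for illustration so that the script prints a message instead of failing when the package is missing.

```r
# Assumption: ggplot2 is installed. The check avoids an error if it is not.
has_ggplot2 <- nzchar(system.file(package = "ggplot2"))

if (has_ggplot2) {
  library(ggplot2)        # attach the package
  data(diamonds)          # load the dataset into the workspace
  print(head(diamonds))   # first six rows
  print(dim(diamonds))    # number of rows and columns
} else {
  message("ggplot2 is not installed; install it before running this example.")
}
```

The sketch uses library() rather than require(): library() stops with an error when the package is missing, whereas require() only returns FALSE with a warning, so a later data() call could fail in a less obvious place.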

 

Load Data Without Loading Package

With a single command, you can load a dataset without loading the package. The command is as follows:

  • Load the dataset using data(dataSetName, package="packageName"). For example, data(diamonds, package="ggplot2")

Again, to make sure that the data is loaded correctly, use the "head" command. For example, head(diamonds)
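A minimal sketch of this approach follows, again assuming ggplot2 is installed. The second half uses the envir argument of data() to load the dataset into a separate environment; the name sandbox is hypothetical and chosen only for illustration.

```r
# Assumption: ggplot2 is installed. system.file() checks this without loading it.
has_ggplot2 <- nzchar(system.file(package = "ggplot2"))

if (has_ggplot2) {
  # Copy the dataset into the current environment; the package is not attached
  data(diamonds, package = "ggplot2")
  print(head(diamonds))

  # Optional: load into a separate environment to keep the workspace uncluttered
  sandbox <- new.env()
  data(diamonds, package = "ggplot2", envir = sandbox)
  print(ls(sandbox))
} else {
  message("ggplot2 is not installed; install it before running this example.")
}
```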

To see which datasets are available, use "data()". It lists the datasets in the packages that are currently loaded, including the "datasets" package that comes with R. To list the datasets in all installed packages, use data(package = .packages(all.available = TRUE)).
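The listing can also be inspected programmatically, as in the sketch below. It relies on the object returned by data() having a results matrix with "Package" and "Item" columns. The variable names are illustrative, and the "datasets" package is used because it ships with every R installation.

```r
# Datasets in a single package
listing <- data(package = "datasets")
items <- listing$results[, "Item"]
print(head(items))

# Datasets across all installed packages. This can be slow with many packages.
# Warnings are suppressed because some packages (e.g. base) only report that
# their datasets have moved elsewhere.
all_listing <- suppressWarnings(data(package = .packages(all.available = TRUE)))
all_items <- all_listing$results[, c("Package", "Item"), drop = FALSE]
print(head(all_items))
```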

Ajitesh Kumar

