- Vector
- List
- Factor
- Data Frame
- Matrix
- Array
The data types listed above are described below:
- Vector: A collection of elements of the same data type. The function c() is used to create vectors. Use a vector when you need to store values that are all of one type. The following is a code sample:
    names = c("Chris", "James", "Ted")
    # executing names prints the following:
    > names
    [1] "Chris" "James" "Ted"

    age = c(48, 56, 50)
    # executing age prints the following:
    > age
    [1] 48 56 50
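To illustrate what "same data type" means in practice, here is a minimal sketch. The variable names (mixed, age_next_year, second_age) are made up for this example. It shows that c() converts mixed inputs to one common type, and that vectors support element-wise arithmetic and 1-based indexing.

```r
# c() coerces mixed inputs to a single common type (here: character)
mixed = c(1, "a", TRUE)
class(mixed)

# Arithmetic on a numeric vector is applied to every element
age = c(48, 56, 50)
age_next_year = age + 1

# Elements are accessed by position, starting at 1
second_age = age[2]
length(age)
```

Because of this coercion, a vector is a poor fit for values of different types. That is the case the list, described next, is meant for.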
- List: A collection of elements that may be of different data types. The function list() is used to create a list. The following code demonstrates a list:
    # The list below consists of two vectors, names and age, which hold two different data types
    l = list(names, age)
    # executing l would print the following:
    > l
    [[1]]
    [1] "Chris" "James" "Ted"

    [[2]]
    [1] 48 56 50

    # Another example, storing a character value and two numeric values in a list
    # (a bare 28 is stored as numeric in R; 28L would be an integer)
    l = list("Chris", 28, 129000.5)
    # executing l would display the following:
    > l
    [[1]]
    [1] "Chris"

    [[2]]
    [1] 28

    [[3]]
    [1] 129000.5
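The sketch below shows one common way to read values back out of a list. The element names (name, age, salary) are an assumption added for illustration. The lists above are unnamed and would be indexed by position instead, for example l[[1]].

```r
# Naming the elements is optional; these names are illustrative only
person = list(name = "Chris", age = 28, salary = 129000.5)

# [[ ]] and $ return the element itself
person_name = person[["name"]]
person_age = person$age

# [ ] returns a smaller list, not the element
sub = person["name"]
class(sub)

# Each element keeps its own type
types = sapply(person, class)
```

The difference between [ ] and [[ ]] is a frequent source of confusion: the first always gives back a list, while the second gives back whatever is stored inside it.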
- Factor: A collection used to represent nominal (categorical) variables. A factor stores data that falls into a fixed set of categories, called levels. For example, gender is classified as male and female. The following shows how to store gender as a factor:
    gender = c("male", "female")
    f = factor(gender)
    # typing f prints the following:
    > f
    [1] male   female
    Levels: female male
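As a minimal sketch of what a factor adds over a plain character vector, the example below uses a slightly longer, made-up gender vector. It inspects the levels, the integer codes R stores internally, and the per-level counts. The names counts and f2 are illustrative.

```r
gender = c("male", "female", "female", "male", "female")
f = factor(gender)

# Distinct categories; by default they are sorted alphabetically
levels(f)

# Internally, each value is stored as an integer code pointing at a level
codes = as.integer(f)

# Frequency of each level
counts = table(f)

# The level order can also be set explicitly
f2 = factor(gender, levels = c("male", "female"))
levels(f2)
```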
- Data Frame: A data frame can be thought of as a database table with rows and columns, where each column may hold a different data type. The function data.frame() is used to create one. The following code example demonstrates the usage of a data frame:
    # Let's store the names and age vectors in a data frame
    d = data.frame(names, age)
    # executing d would print the following:
    > d
      names age
    1 Chris  48
    2 James  56
    3   Ted  50
From the example, you may infer that the vectors combined into a data frame need to be of the same length, since every column must have one value per row.
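The following sketch rebuilds the same data frame and shows a few typical operations: counting rows and columns, reading a column, and filtering rows. It also checks the length requirement by trying to combine a 3-element vector with a 2-element one. The names older and result are made up for this example.

```r
names = c("Chris", "James", "Ted")
age = c(48, 56, 50)
d = data.frame(names, age)

# Dimensions of the table
nrow(d)
ncol(d)

# A column is accessed with $ and comes back as a vector
d$age

# Rows can be filtered with a logical condition
older = d[d$age > 49, ]

# Combining vectors of length 3 and 2 fails with an error
result = tryCatch(
  data.frame(names, age = c(48, 56)),
  error = function(e) "error"
)
```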
- Matrix: A matrix represents data elements of a single type arranged in rows and columns. The function matrix() is used to create one. The following code example demonstrates the usage of a matrix:
    m = matrix(1:10, nrow=2, ncol=5)
    # executing m would print the following:
    > m
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    3    5    7    9
    [2,]    2    4    6    8   10
In the code example above, 1:10 represents the sequence of numbers from 1 to 10, nrow is the number of rows, and ncol is the number of columns. Note that the values are filled in column by column.
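As a short sketch of how the fill order affects the result, the example below builds the same matrix twice, once with the default column-wise fill and once with byrow = TRUE, and then reads a single cell from each. The name m_by_row is illustrative.

```r
# Default: values fill the matrix column by column
m = matrix(1:10, nrow = 2, ncol = 5)

# byrow = TRUE fills row by row instead
m_by_row = matrix(1:10, nrow = 2, ncol = 5, byrow = TRUE)

# Dimensions as (rows, columns)
dim(m)

# Cells are addressed as [row, column]
m[2, 3]          # 6
m_by_row[2, 3]   # 8

# t() transposes the matrix, giving 5 rows and 2 columns
dim(t(m))
```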
- Array: In R, the array() function stores data across any number of dimensions. A one-dimensional array looks like a vector, and a two-dimensional array is the same as a matrix. The following code examples demonstrate arrays:
    a = array(1:6, c(2,3))
    # executing a would print the following:
    > a
         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4    6

    # Let's look at another example
    a = array(1:6, c(2,1,3))
    # executing a would print the following:
    > a
    , , 1

         [,1]
    [1,]    1
    [2,]    2

    , , 2

         [,1]
    [1,]    3
    [2,]    4

    , , 3

         [,1]
    [1,]    5
    [2,]    6
Notice that the second array is printed one slice at a time, because its data is spread across three dimensions (2 rows, 1 column, 3 slices).
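The sketch below shows how elements of the three-dimensional array are addressed, and checks the statement that a two-dimensional array behaves as a matrix. The names slice2 and a2 are made up for this example.

```r
a = array(1:6, c(2, 1, 3))

# The dimension vector: 2 rows, 1 column, 3 slices
dim(a)

# One index per dimension: row 2, column 1, slice 3
a[2, 1, 3]   # 6

# Leaving an index empty selects everything along that dimension
slice2 = a[, , 2]

# A two-dimensional array is treated as a matrix
a2 = array(1:6, c(2, 3))
is.matrix(a2)
all(a2 == matrix(1:6, nrow = 2))
```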
R Data Types – Good Bookmarks
The following web pages describe R data types well:
- Data Types in R: This page presents a decent description of R data types with good code examples.
- R Data Types
- Basic Data types in R
- R Data types on Wikibooks
Additionally, you can learn more about R data types by typing the following in the R console:
- help(command_name). For example, help(data.frame)
- ?command_name. For example, ?data.frame
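Besides the help pages, the console can also report what kind of object a variable holds. The following sketch creates one small object of each type covered in this post and collects the result of class() for each. The variable names and the kinds vector are illustrative only.

```r
v = c(48, 56, 50)
l = list("Chris", 28)
f = factor(c("male", "female"))
d = data.frame(age = v)
m = matrix(1:10, nrow = 2)
a = array(1:6, c(2, 1, 3))

# class() can return more than one value for a matrix, so only the first is kept
kinds = c(
  vector = class(v),
  list = class(l),
  factor = class(f),
  data_frame = class(d),
  matrix = class(m)[1],
  array = class(a)
)
print(kinds)

# str() prints a compact summary of the structure of any object
str(d)
```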