This section introduces R's data structures (vectors, dataframes, lists, and matrices), but our key focus is on dataframes.
Learning objectives
To understand data structures: vectors, dataframes, lists, matrices
To do basic analysis
Please Read
2.1 Vectors
Remember the objects we created in the previous section? Those were all vectors. A vector is the basic data structure used to hold values of the same type. As in the previous section, a vector can be:
numeric
character
logical
We are repeating some material from the previous section, but it is worth it.
2.1.1 Character vector
Let us create a character vector of countries in Southern Africa:
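A minimal sketch of this step, assuming the vector holds the same ten countries that appear in the country_names vector in section 2.2 (the exact contents and order are an assumption):

```r
# Assumed contents: the ten countries listed for country_names in section 2.2
southern_africa <- c("Angola", "Botswana", "Lesotho", "Malawi", "Mozambique",
                     "Namibia", "South Africa", "Swaziland", "Zambia", "Zimbabwe")

# Typing the name of a vector prints its elements
southern_africa
```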
We have created a vector named southern_africa, which holds the countries in the Southern African region. Let us use basic functions to examine our southern_africa vector. We can get the type of a vector by using the class() function:
class(southern_africa)
[1] "character"
It is a character vector. Recall the character data type from the previous section.
We can examine the length by using the length() function:
length(southern_africa)
[1] 10
We have 10 elements in the southern_africa vector.
2.1.2 Numeric vector
Let us create a numeric vector named life_expectancy that holds the average life expectancy of the countries of Southern Africa:
We can confirm the type of vector we have created by using the class() function:
class(life_expectancy)
[1] "numeric"
Indeed, the life_expectancy vector is a numeric vector.
Let us do some basic analysis of this vector. We can get the mean by using the mean() function:
mean(life_expectancy)
[1] 59.72
We can get the median and standard deviation of the life_expectancy vector using the median() and sd() functions, respectively:
median(life_expectancy)
[1] 60.2
sd(life_expectancy)
[1] 2.898582
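To try these functions yourself, here is a minimal sketch using a short hypothetical vector, toy_values. Its numbers are made up for illustration and are not the life expectancy data:

```r
# Hypothetical values for illustration only (not the life expectancy data)
toy_values <- c(2, 4, 4, 6, 9)

class(toy_values)   # "numeric"
mean(toy_values)    # 5
median(toy_values)  # 4
sd(toy_values)      # sample standard deviation (divides by n - 1)
```

Note that sd() computes the sample standard deviation, so it divides by n - 1 rather than n.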
You can extract an element of a vector by using square brackets, []. Indexing in R starts at 1. Let us get the first element of the life_expectancy vector:
life_expectancy[1]
[1] 61.6
To get the 1st, 5th, and 8th elements of a vector, pass a vector of positions inside the brackets:
life_expectancy[c(1, 5, 8)]
[1] 61.6 62.9 62.3
You can also extract a range of elements by using the colon operator (:):
life_expectancy[3:6]
[1] 57.1 53.1 62.9 59.3
Here, we wanted to get all the elements starting from the 3rd position to the 6th position.
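The three indexing patterns above can be sketched on a small hypothetical vector, toy_numbers, whose values are chosen so that each result is easy to check by eye (the vector is an illustration, not part of the course data):

```r
# Hypothetical vector: the element at position i is 10 * i
toy_numbers <- c(10, 20, 30, 40, 50, 60, 70, 80)

toy_numbers[1]           # first element: 10
toy_numbers[c(1, 5, 8)]  # 1st, 5th, and 8th elements: 10 50 80
toy_numbers[3:6]         # 3rd through 6th elements: 30 40 50 60

# The colon operator simply builds a sequence of positions
3:6                      # 3 4 5 6
```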
Key lesson: a vector holds items of the same type, as we have seen in the southern_africa and life_expectancy vectors.
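One way to see this rule in action is to try mixing types in a single vector. In this minimal sketch (the vector name mixed is hypothetical), R converts every element to one common type instead of keeping the mix:

```r
# Mixing a number, a string, and a logical in one vector
mixed <- c(1, "a", TRUE)

class(mixed)  # "character"
mixed         # "1" "a" "TRUE" (every element is now a string)

# Mixing only a number and a logical gives a numeric vector
class(c(1, TRUE))  # "numeric"
```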
2.2 Dataframes
Dataframes will be the key focus throughout the course, so I will only briefly explain what a dataframe is here. A dataframe is a tabular data format consisting of columns and rows. Let us use an example by creating a dataframe in R:
# Create a character vector
country_names <- c("Angola", "Botswana", "Lesotho", "Malawi", "Mozambique", "Namibia", "South Africa", "Swaziland", "Zambia", "Zimbabwe")
country_names
We have created the africa_df dataframe, with columns and rows. Let us examine it. How many columns and rows are in the dataframe? We can use the str() function:
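A minimal sketch of building and inspecting a dataframe follows. The column names and the country_id column are hypothetical placeholders, not the course's actual africa_df columns; only country_names and the name africa_df come from the text:

```r
# Character vector of country names (from the text above)
country_names <- c("Angola", "Botswana", "Lesotho", "Malawi", "Mozambique",
                   "Namibia", "South Africa", "Swaziland", "Zambia", "Zimbabwe")

# Hypothetical placeholder column: a running ID from 1 to 10, not real data
country_id <- seq_along(country_names)

# Combine the two vectors into a dataframe, one column per vector
africa_df <- data.frame(country = country_names, country_id = country_id)

str(africa_df)   # compact summary: dimensions plus each column's type
nrow(africa_df)  # 10 rows
ncol(africa_df)  # 2 columns
```

Each column of a dataframe is a vector, so every column holds values of a single type, while different columns may have different types.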
Did you see that, Jimmy? We actually printed the list. As you advance in your programming with R, you will see why lists are important and how everything is a list.
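As a small illustration of how lists sit underneath other structures, this sketch checks that a dataframe is itself stored as a list of columns. The dataframe toy_df and its columns are hypothetical:

```r
# Hypothetical two-column dataframe
toy_df <- data.frame(x = 1:3, y = c("a", "b", "c"))

is.list(toy_df)  # TRUE: a dataframe is a list of equal-length columns
typeof(toy_df)   # "list"
length(toy_df)   # 2, the number of columns (list elements)

# List-style extraction with [[ ]] returns one column as a vector
toy_df[["x"]]    # 1 2 3
```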