13 Solutions || Adv. Data Manipulation
Write one command (it can span multiple lines) using pipes that will output a data frame that has only the columns lifeExp, country and year for the records before the year 2000 from African countries, but not for other continents.
tidy_africa <- gapminder %>%
  # keep African records from before the year 2000
  dplyr::filter(continent == "Africa", year < 2000) %>%
  dplyr::select(year, country, lifeExp)
head(tidy_africa)
## # A tibble: 6 × 3
## year country lifeExp
## <int> <fct> <dbl>
## 1 1952 Algeria 43.1
## 2 1957 Algeria 45.7
## 3 1962 Algeria 48.3
## 4 1967 Algeria 51.4
## 5 1972 Algeria 54.5
## 6 1977 Algeria 58.0
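Since head() only shows the first six rows, it does not confirm that both conditions hold for the whole result. The following is a minimal sketch of a sanity check, assuming the dplyr and gapminder packages are installed; the names africa_pre2000 and african_countries are only illustrative.

```r
library(dplyr)
library(gapminder)

# Apply both conditions: African countries and years before 2000
africa_pre2000 <- gapminder %>%
  filter(continent == "Africa", year < 2000) %>%
  select(year, country, lifeExp)

# The latest year kept should be below 2000
max(africa_pre2000$year)

# Every country kept should be listed under Africa in the original data
african_countries <- unique(gapminder$country[gapminder$continent == "Africa"])
all(africa_pre2000$country %in% african_countries)
```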
Calculate the average life expectancy per country. Which country has the longest average life expectancy and which one the shortest average life expectancy?
gapminder %>%
  dplyr::group_by(country) %>%
  dplyr::summarize(mean_lifeExp = mean(lifeExp)) %>%
  dplyr::filter(mean_lifeExp == min(mean_lifeExp) | mean_lifeExp == max(mean_lifeExp))
## # A tibble: 2 × 2
## country mean_lifeExp
## <fct> <dbl>
## 1 Iceland 76.5
## 2 Sierra Leone 36.8
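Iceland therefore has the longest average life expectancy and Sierra Leone the shortest. As an alternative to filtering on min() and max(), the two extremes can also be picked out with slice_min() and slice_max() from dplyr (available since dplyr 1.0.0). This is a minimal sketch, assuming the dplyr and gapminder packages are installed; the names mean_life, shortest and longest are only illustrative.

```r
library(dplyr)
library(gapminder)

# Average life expectancy per country
mean_life <- gapminder %>%
  group_by(country) %>%
  summarize(mean_lifeExp = mean(lifeExp))

# slice_min() / slice_max() return the row(s) with the smallest / largest value
shortest <- slice_min(mean_life, mean_lifeExp, n = 1)
longest  <- slice_max(mean_life, mean_lifeExp, n = 1)

bind_rows(shortest, longest)
```

Keeping the two results separate makes it explicit which country is the minimum and which is the maximum, instead of reading that off the values.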
In the previous hands-on you discovered that all the entries from 2007 are actually from 2008. Write a command to edit the data accordingly using pipes. In the same command filter only the entries from 2008 to verify the change.
gapminder %>%
  dplyr::mutate(year = ifelse(year == 2007, 2008, year)) %>%
  dplyr::filter(year == 2008) %>%
  head()
## # A tibble: 6 × 6
## country continent year lifeExp pop gdpPercap
## <fct> <fct> <dbl> <dbl> <int> <dbl>
## 1 Afghanistan Asia 2008 43.8 31889923 975.
## 2 Albania Europe 2008 76.4 3600523 5937.
## 3 Algeria Africa 2008 72.3 33333216 6223.
## 4 Angola Africa 2008 42.7 12420476 4797.
## 5 Argentina Americas 2008 75.3 40301927 12779.
## 6 Australia Oceania 2008 81.2 20434176 34435.
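Note that year is shown as <dbl> in this output, whereas it is an <int> column in the original data: ifelse() returns a double here because 2008 is a double literal. One possible variant that keeps the integer type is sketched below, assuming the dplyr and gapminder packages are installed; it uses dplyr::if_else() with integer literals, and the name updated is only illustrative.

```r
library(dplyr)
library(gapminder)

# Replace 2007 by 2008 while keeping year as an integer column
updated <- gapminder %>%
  mutate(year = if_else(year == 2007L, 2008L, year))

# Verify the change: 2007 should no longer appear among the years
updated %>%
  count(year)
```

Counting the rows per year checks the whole column at once, rather than only the first six rows returned by head().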