In this tutorial, we will learn how to collapse the values from multiple rows of a column into a single row per group, where the groups are defined by another column.
Let us get started by loading tidyverse and checking the tidyr package version.
library(tidyverse)
packageVersion("tidyr")
## [1] '1.2.0'
To illustrate collapsing multiple rows into a single one, let us create a toy dataframe with continents and countries.
df <- tibble(
  continent = c("America", "Europe", "America", "Asia", "Europe", "Europe"),
  country = c("USA", "England", "Canada", "Singapore", "France", "Germany")
)
Our aim is to collapse the countries from the same continent into a single row. This is very similar to grouping and summarising numeric data, but here the values being summarised are text strings.
df
## # A tibble: 6 × 2
##   continent country
##   <chr>     <chr>
## 1 America   USA
## 2 Europe    England
## 3 America   Canada
## 4 Asia      Singapore
## 5 Europe    France
## 6 Europe    Germany
Using group_by(), summarize(), and paste0(), we can collapse multiple rows into a single row. Here we group by continent and summarize country by pasting its values together with a comma delimiter, which we pass through the collapse argument.
df %>%
  group_by(continent) %>%
  summarize(country = paste0(country, collapse = ", "))
And this is how the result looks after collapsing the countries that belong to each continent into a single row.
## # A tibble: 3 × 2
##   continent country
##   <chr>     <chr>
## 1 America   USA, Canada
## 2 Asia      Singapore
## 3 Europe    England, France, Germany
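The collapsing itself comes from the collapse argument: with it, paste0() joins a whole character vector into one string instead of returning one string per element. The sketch below shows that on a plain vector and then repeats the grouped summary with base R's toString(), which is a shortcut for paste(x, collapse = ", "). It rebuilds the same toy data so it can be run on its own; the name collapsed is only an illustrative choice, and this is one possible variant rather than the only way to do it.

```r
library(tidyverse)

# Same toy data as in the tutorial
df <- tibble(
  continent = c("America", "Europe", "America", "Asia", "Europe", "Europe"),
  country = c("USA", "England", "Canada", "Singapore", "France", "Germany")
)

# With collapse, a character vector becomes a single string
paste0(c("USA", "Canada"), collapse = ", ")
## [1] "USA, Canada"

# toString() joins the elements of each group with ", "
collapsed <- df %>%
  group_by(continent) %>%
  summarize(country = toString(country))

collapsed
```

For this data, the result should match the paste0() version. toString() always uses ", " as the separator, so paste0() or paste() with collapse remains the better fit when a different delimiter is needed.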