How to collapse multiple rows based on a column

In this tutorial, we will learn how to collapse the values from multiple rows of one column into a single row, based on the groups in another column.

Let us get started by loading tidyverse and checking the tidyr package version.

library(tidyverse)
packageVersion("tidyr")

## [1] '1.2.0'

To illustrate collapsing multiple rows into a single one, let us create a toy dataframe with continents and countries.

df <- tibble(
  continent = c("America","Europe","America","Asia","Europe","Europe"),
  country = c("USA","England","Canada", "Singapore","France", "Germany")
)

Our aim is to collapse the countries from the same continent into a single row. This is very similar to grouping and summarising, but with text/string data instead of numbers.

df
## # A tibble: 6 × 2
##   continent country  
##   <chr>     <chr>    
## 1 America   USA      
## 2 Europe    England  
## 3 America   Canada   
## 4 Asia      Singapore
## 5 Europe    France   
## 6 Europe    Germany
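Before grouping, it may help to see what collapsing does to a single character vector. The sketch below is only an illustration: the variable names america and collapsed are made up for this example, and the vector holds the two American countries from the toy data.

```r
# Countries for one continent, taken from the toy data above
# (the names `america` and `collapsed` are illustrative only)
america <- c("USA", "Canada")

# collapse = ", " joins all elements into one comma-separated string
collapsed <- paste0(america, collapse = ", ")

collapsed
## [1] "USA, Canada"

# The result is a single string, not a vector of two strings
length(collapsed)
## [1] 1
```

Grouping by continent applies this same idea once per continent.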

Using group_by(), summarize(), and paste0(), we can collapse multiple rows into a single row. Here we group by continent and summarize country by pasting its values together with a comma delimiter, set through the collapse argument of paste0().

df %>% 
  group_by(continent) %>%
  summarize(country=paste0(country, collapse = ", "))

And this is how the result looks after collapsing the rows for the countries of each continent, with one row per continent.


## # A tibble: 3 × 2
##   continent country                 
##   <chr>     <chr>                   
## 1 America   USA, Canada             
## 2 Asia      Singapore               
## 3 Europe    England, France, Germany
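
As a possible variation, the same collapse can be sketched with base R's toString(), which is shorthand for paste(x, collapse = ", "). The second pipeline additionally assumes you want the countries listed alphabetically within each continent, which the original example does not do. The names collapsed and collapsed_sorted are hypothetical and used only for this illustration.

```r
library(dplyr)

# Same toy dataframe as above, repeated so this snippet runs on its own
df <- tibble(
  continent = c("America","Europe","America","Asia","Europe","Europe"),
  country = c("USA","England","Canada", "Singapore","France", "Germany")
)

# toString() is base R shorthand for paste(x, collapse = ", ")
collapsed <- df %>%
  group_by(continent) %>%
  summarize(country = toString(country))

# Optional assumption: sort the countries within each continent before collapsing
collapsed_sorted <- df %>%
  group_by(continent) %>%
  summarize(country = toString(sort(country)))
```

Here collapsed should match the output shown above, while collapsed_sorted lists America as "Canada, USA" because the countries are sorted before being pasted together.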