In this tutorial, we will learn how to add numbers to rows within each group in a dataframe in R. The picture below illustrates an example of adding row numbers per group.
Packages And Data
We will first create a sample dataframe using the tibble() function and then use dplyr’s functions to add a sequence of numbers to each group in the dataframe, based on a grouping variable, in two ways. First, we will use dplyr’s row_number() function in combination with group_by() to add numbers within each group. Then we will use dplyr’s n() function to do the same.
Let us first load tidyverse.
library(tidyverse)
We create a simple dataframe with two groups using tibble().
df <- tibble(grp = sample(c("g1","g2"), 6, replace=TRUE), counts = sample(1:20,6) )
Note the grouping variable grp has two unique values.
df
## # A tibble: 6 × 2
##   grp   counts
##   <chr>  <int>
## 1 g2         4
## 2 g1         5
## 3 g1        14
## 4 g2        19
## 5 g2         2
## 6 g2         8
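Because sample() draws values at random, running the code above will usually produce different rows from the ones printed here. A minimal sketch for following along with the same values, assuming you are happy to type the data in by hand, is shown below. The name df_fixed is hypothetical and not part of the original example.

```r
library(dplyr)

# Hard-coded copy of the tibble printed above
# (df_fixed is a hypothetical name used only for illustration)
df_fixed <- tibble(
  grp    = c("g2", "g1", "g1", "g2", "g2", "g2"),
  counts = c(4L, 5L, 14L, 19L, 2L, 8L)
)

# Count the rows in each group: g1 has 2 rows and g2 has 4
group_sizes <- df_fixed %>% count(grp)
group_sizes
```

Using df_fixed in place of df in the steps that follow should reproduce the printed output exactly.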
How to add numbers to rows within each group in a data frame
To add a sequence of numbers to rows within each group, we first use dplyr's group_by() function on the column that contains the groups. In this example, the “grp” column contains the grouping variable. After applying group_by(), we can use mutate() to add a new column and row_number() to number each row within its group.
df %>% group_by(grp) %>% mutate(id=row_number())
## # A tibble: 6 × 3
## # Groups:   grp [2]
##   grp   counts    id
##   <chr>  <int> <int>
## 1 g2         4     1
## 2 g1         5     1
## 3 g1        14     2
## 4 g2        19     2
## 5 g2         2     3
## 6 g2         8     4
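Note that the grouping variable does not have to be ordered for this to work. To make the numbering within each group easier to see, we can sort the result by the grouping variable with arrange().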
df %>% group_by(grp) %>% mutate(id=row_number()) %>% arrange(grp)
## # A tibble: 6 × 3
## # Groups:   grp [2]
##   grp   counts    id
##   <chr>  <int> <int>
## 1 g1         5     1
## 2 g1        14     2
## 3 g2         4     1
## 4 g2        19     2
## 5 g2         2     3
## 6 g2         8     4
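row_number() numbers rows in the order in which they currently appear within each group, so sorting before numbering changes which row gets which id. The sketch below, which assumes a hard-coded copy of the example data under the hypothetical name df_fixed, sorts by counts first so that id 1 goes to the smallest counts value in each group.

```r
library(dplyr)

# Hard-coded copy of the example data (df_fixed is a hypothetical name)
df_fixed <- tibble(
  grp    = c("g2", "g1", "g1", "g2", "g2", "g2"),
  counts = c(4L, 5L, 14L, 19L, 2L, 8L)
)

# Sort by counts, then number the rows within each group.
# Within a group, ids now follow increasing counts.
ranked <- df_fixed %>%
  arrange(counts) %>%
  group_by(grp) %>%
  mutate(id = row_number()) %>%
  ungroup()

ranked
```

Calling ungroup() at the end is optional here. It simply returns a plain tibble so that later steps are not applied per group by accident.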
Adding numbers to rows within each group in a data frame: Second way
Here is another way to add row numbers within each group using dplyr functions in R. In this example, we use n() instead of row_number(). The dplyr function n() returns the number of rows in the current group, so 1:n() creates the sequence from 1 up to the size of that group.
df %>% group_by(grp) %>% mutate(id=1:n())
## # A tibble: 6 × 3
## # Groups:   grp [2]
##   grp   counts    id
##   <chr>  <int> <int>
## 1 g2         4     1
## 2 g1         5     1
## 3 g1        14     2
## 4 g2        19     2
## 5 g2         2     3
## 6 g2         8     4
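As a quick check that the two approaches agree, the sketch below computes both ids side by side on a hard-coded copy of the example data (df_fixed and the id_* column names are hypothetical). It also includes seq_len(n()), a base R variant that is not part of the original example. seq_len() is a common idiom for building the sequence 1 to n because seq_len(0) is empty, whereas 1:0 counts down from 1 to 0.

```r
library(dplyr)

# Hard-coded copy of the example data (df_fixed is a hypothetical name)
df_fixed <- tibble(
  grp    = c("g2", "g1", "g1", "g2", "g2", "g2"),
  counts = c(4L, 5L, 14L, 19L, 2L, 8L)
)

both <- df_fixed %>%
  group_by(grp) %>%
  mutate(
    id_row = row_number(),   # first approach
    id_n   = 1:n(),          # second approach
    id_seq = seq_len(n())    # same sequence built with seq_len()
  ) %>%
  ungroup()

# Compare the three id columns
same_n   <- identical(both$id_row, both$id_n)
same_seq <- identical(both$id_row, both$id_seq)
c(same_n, same_seq)
```

For this data all three columns hold the same within-group numbering, so either approach can be used.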