In this tutorial, we will learn how to add a unique row number to each row of a dataframe/tibble. First, we will use dplyr’s row_number() function to add the row number as a column using tidyverse. Then we will see an example of adding a row number to a dataframe using base R.
Let us load tidyverse, the suite of R packages from RStudio, which includes dplyr, and check dplyr’s version.
library(tidyverse)
packageVersion("dplyr")
## [1] ‘1.0.7’
To illustrate how to add a unique row number to a dataframe, we will use the “faithfuld” dataset: the classic waiting and eruptions data, faithful, but with a 2D density estimate. faithfuld is one of the datasets built into the ggplot2 package.
Let us take a look at the faithfuld dataset using the head() function.
faithfuld %>% head()
## # A tibble: 6 × 3
##   eruptions waiting density
##       <dbl>   <dbl>   <dbl>
## 1      1.6       43 0.00322
## 2      1.65      43 0.00384
## 3      1.69      43 0.00444
## 4      1.74      43 0.00498
## 5      1.79      43 0.00542
## 6      1.84      43 0.00574
How to add unique row number to a dataframe in R using tidyverse
To add a unique row number as one of the columns of the dataset, we use the row_number() function inside mutate() from dplyr, as shown below. Here we assign the row number to a new column named “row_id”.
faithfuld %>%
  mutate(row_id=row_number())
## # A tibble: 5,625 × 4
##    eruptions waiting density row_id
##        <dbl>   <dbl>   <dbl>  <int>
##  1      1.6       43 0.00322      1
##  2      1.65      43 0.00384      2
##  3      1.69      43 0.00444      3
##  4      1.74      43 0.00498      4
##  5      1.79      43 0.00542      5
##  6      1.84      43 0.00574      6
##  7      1.88      43 0.00592      7
##  8      1.93      43 0.00594      8
##  9      1.98      43 0.00581      9
## 10      2.03      43 0.00554     10
## # … with 5,615 more rows
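Note that the pipeline above only prints its result; the original dataframe is left unchanged unless the result is assigned to a name. Here is a minimal sketch of that, using a small made-up tibble called toy (hypothetical, not part of the original example) so that the full result is easy to inspect.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data, used only for illustration
toy <- tibble(
  eruptions = c(1.60, 1.65, 1.69),
  waiting   = c(43, 43, 43)
)

# Assign the result to keep the new column;
# mutate() returns a new tibble and does not change `toy` itself
toy_with_id <- toy %>%
  mutate(row_id = row_number())

# With no arguments, row_number() numbers rows 1, 2, ..., n in their current order
toy_with_id$row_id

# The original tibble still has only its two columns
names(toy)
```

If the row numbers should be kept for later steps, the assigned object (here toy_with_id) is the one to carry forward.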
Move unique row number column to the front with relocate() function in dplyr
Notice that the new column “row_id” is the last column in the dataframe. That is because, by default, the mutate() function creates a new column after all existing columns in the dataframe.
To move a column to the first position in the dataframe, we can use the relocate() function with the name of the column of interest. In this example, we move the row_id column from the last position to the first.
faithfuld %>%
  mutate(row_id=row_number()) %>%
  relocate(row_id)
## # A tibble: 5,625 × 4
##    row_id eruptions waiting density
##     <int>     <dbl>   <dbl>   <dbl>
##  1      1      1.6       43 0.00322
##  2      2      1.65      43 0.00384
##  3      3      1.69      43 0.00444
##  4      4      1.74      43 0.00498
##  5      5      1.79      43 0.00542
##  6      6      1.84      43 0.00574
##  7      7      1.88      43 0.00592
##  8      8      1.93      43 0.00594
##  9      9      1.98      43 0.00581
## 10     10      2.03      43 0.00554
## # … with 5,615 more rows
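As an aside, the same "row number in the first column" result can be reached in a single step. The sketch below shows two alternatives that are not used in the example above: the .before argument of mutate() (available since dplyr 1.0.0) and rowid_to_column() from the tibble package. The tibble toy is again a made-up example for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data, used only for illustration
toy <- tibble(
  eruptions = c(1.60, 1.65, 1.69),
  waiting   = c(43, 43, 43)
)

# Option 1: create the column and place it first in one mutate() call
via_before <- toy %>%
  mutate(row_id = row_number(), .before = 1)

# Option 2: rowid_to_column() adds the row id as the first column
via_rowid <- toy %>%
  rowid_to_column(var = "row_id")

names(via_before)
names(via_rowid)
```

Which one to prefer is mostly a matter of taste; relocate() remains handy when the column already exists and only needs to be moved.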
Adding row number using base R
We can also add a row number to the dataframe the base R way. First, we create a variable containing the row numbers. Here we use the seq() function to create a vector containing a sequence of numbers from 1 to the number of rows in the dataframe.
# create sequence of number of size equal
# to the number of rows of dataframe
row_id <- seq(1, nrow(faithfuld))
head(row_id)
## [1] 1 2 3 4 5 6
Then we can add the vector to the dataframe using the $ operator.
faithfuld$row_id <- row_id
And we get a dataframe with a column containing the row number.
head(faithfuld)
## # A tibble: 6 × 4
##   eruptions waiting density row_id
##       <dbl>   <dbl>   <dbl>  <int>
## 1      1.6       43 0.00322      1
## 2      1.65      43 0.00384      2
## 3      1.69      43 0.00444      3
## 4      1.74      43 0.00498      4
## 5      1.79      43 0.00542      5
## 6      1.84      43 0.00574      6
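The base R steps above work well for a dataframe with at least one row. A slightly more defensive variant, sketched here on a made-up data frame called toy (hypothetical, for illustration only), uses seq_len(), which returns an empty vector when there are no rows, whereas seq(1, 0) counts down from 1 to 0.

```r
# Hypothetical toy data frame, used only for illustration
toy <- data.frame(
  eruptions = c(1.60, 1.65, 1.69),
  waiting   = c(43, 43, 43)
)

# seq_len(n) gives 1, 2, ..., n, and an empty integer vector when n is 0
toy$row_id <- seq_len(nrow(toy))
toy$row_id

# Contrast the two approaches on a zero-row data frame
empty <- toy[0, ]
seq(1, nrow(empty))    # counts down: 1 0
seq_len(nrow(empty))   # integer(0)
```

For the faithfuld example the two approaches give the same result; the difference only matters when a dataframe may be empty, for instance after filtering.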
You might also want to check out this post on adding row numbers within each group using row_number().