
Rstats 101

Learn R Programming Tips & Tricks for Statistics and Data Science


3 ways to rank numbers with tidyverse

rstats101 · September 16, 2023

In this tutorial, we will learn three ways to compute integer ranks with the tidyverse. dplyr has three integer ranking functions inspired by SQL: row_number(), min_rank(), and dense_rank(). They differ only in how they handle ties.

library(tidyverse)
packageVersion("dplyr")
[1] '1.1.2'

Let us jump into the simple examples given in the dplyr documentation and create a tibble with a single sorted column that contains ties.

df <- tibble(x = c(10, 20, 20, 60))
print(df)

# A tibble: 4 × 1
      x
  <dbl>
1    10
2    20
3    20
4    60

row_number(): a unique rank for every element

row_number() gives every input a unique rank, breaking ties by order of appearance, so that c(10, 20, 20, 30) would get ranks c(1, 2, 3, 4). It is equivalent to rank(ties.method = "first").

df %>%
  mutate(row_no = row_number(x))

# A tibble: 4 × 2
      x row_no
  <dbl>  <int>
1    10      1
2    20      2
3    20      3
4    60      4
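The equivalence with base R's rank() mentioned above can be checked directly. This is a minimal sketch that applies both functions to a plain vector rather than a tibble column; the variable names row_no and base_first are only illustrative.

```r
library(dplyr)

x <- c(10, 20, 20, 60)

# row_number() breaks ties by position: the first 20 gets rank 2, the second 20 gets rank 3
row_no <- row_number(x)

# Base R's rank() with ties.method = "first" uses the same tie-breaking rule
base_first <- rank(x, ties.method = "first")

# Compare the two results element by element
all(row_no == base_first)
```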

min_rank(): lowest rank for all tied elements

min_rank() handles ties by giving every tied element the lowest rank in its group, which leaves gaps in the ranking. In the example below, both 20s get rank 2, rank 3 is skipped, and 60 gets rank 4.

df %>%
  mutate(min_rank = min_rank(x))

# A tibble: 4 × 2
      x min_rank
  <dbl>    <int>
1    10        1
2    20        2
3    20        2
4    60        4
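One way to see where the gap comes from is to compute the same ranks by hand: the minimum rank of a value is one plus the number of values strictly smaller than it. The sketch below is only an illustration of that idea, not how dplyr implements min_rank(); manual_min is a hypothetical name, and missing values are not handled.

```r
library(dplyr)

x <- c(10, 20, 20, 60)

# Hand-rolled minimum rank: count the strictly smaller values and add one
manual_min <- sapply(x, function(v) sum(x < v) + 1L)

# dplyr's result, and base R's rank() with ties.method = "min", for comparison
dplyr_min <- min_rank(x)
base_min <- rank(x, ties.method = "min")

all(manual_min == dplyr_min)
all(base_min == dplyr_min)
```

Because two values sit below 60 plus the tied pair occupies two positions, three values are strictly smaller than 60, so it lands on rank 4 and rank 3 is never used.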

dense_rank(): ranking with no gaps

dense_rank() is similar to min_rank() in that tied elements all receive the same smallest rank, but unlike min_rank() it leaves no gaps: the next distinct value always gets the next integer. For example:

df %>%
  mutate(dense_rank = dense_rank(x))

# A tibble: 4 × 2
      x dense_rank
  <dbl>      <int>
1    10          1
2    20          2
3    20          2
4    60          3
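A dense rank can also be read as the position of each value among the sorted distinct values. The following sketch reproduces the result with base R's match(), sort(), and unique(); it is meant as an illustration under the assumption of no missing values, not as dplyr's own implementation, and manual_dense is a hypothetical name.

```r
library(dplyr)

x <- c(10, 20, 20, 60)

# Sorted distinct values: 10, 20, 60
distinct_sorted <- sort(unique(x))

# Each element's dense rank is its position in that vector
manual_dense <- match(x, distinct_sorted)

# Compare with dplyr
all(manual_dense == dense_rank(x))
```

Since there are only three distinct values, the largest dense rank is 3 even though the vector has four elements.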

3 ranking functions in action

The previous examples showed how the three ranking functions work and how they differ. Now let us look at another example, where the original column is not sorted.

Our data looks like this.

df2 <- tibble(y = c(8, 5, 4, 4, 6))
print(df2)

# A tibble: 5 × 1
      y
  <dbl>
1     8
2     5
3     4
4     4
5     6

Applied to y, row_number() gives us:

df2 %>%
  mutate(row_no = row_number(y))

# A tibble: 5 × 2
      y row_no
  <dbl>  <int>
1     8      5
2     5      3
3     4      1
4     4      2
5     6      4

min_rank() gives us:

df2 %>%
  mutate(min_rank = min_rank(y))

# A tibble: 5 × 2
      y min_rank
  <dbl>    <int>
1     8        5
2     5        3
3     4        1
4     4        1
5     6        4

And dense_rank() gives us:

df2 %>%
  mutate(dense_rank = dense_rank(y))

# A tibble: 5 × 2
      y dense_rank
  <dbl>      <int>
1     8          4
2     5          2
3     4          1
4     4          1
5     6          3
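To compare the three functions side by side, they can all be computed in a single mutate() call. This sketch reuses the unsorted data from this section; the name ranked is only illustrative.

```r
library(dplyr)

# Same unsorted data as above, with a tie between the two 4s
df2 <- tibble(y = c(8, 5, 4, 4, 6))

# Add one column per ranking function
ranked <- df2 %>%
  mutate(
    row_no = row_number(y),
    min_rank = min_rank(y),
    dense_rank = dense_rank(y)
  )

print(ranked)
```

The three columns agree except at and after the tie: row_number() splits the two 4s into ranks 1 and 2, while min_rank() and dense_rank() give both rank 1 and then differ on whether rank 2 is skipped.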

