slice_min: Get Rows with minimum values of a column

dplyr's slice_min(): Rows with lowest values for a column

In this tutorial, we will learn how to get the rows with the lowest values of a column from a data frame in R. We will use dplyr’s slice_min() function, first to select the single row with the minimum value of a column and then to select the bottom n rows with the lowest values of a variable.

library(tidyverse)
library(palmerpenguins)

To illustrate how slice_min() works, let us simplify our penguins data by dropping the rows with missing values and keeping just a few columns. We will also add a row number column, row_id, using dplyr’s row_number() so that we can quickly see which rows we have selected.

penguins <- penguins %>%
  drop_na() %>%
  select(species, sex, body_mass_g, flipper_length_mm) %>%
  mutate(row_id = row_number())

Our data looks like this.

penguins %>% head()

# A tibble: 6 × 5
  species sex    body_mass_g flipper_length_mm row_id
  <fct>   <fct>        <int>             <int>  <int>
1 Adelie  male          3750               181      1
2 Adelie  female        3800               186      2
3 Adelie  female        3250               195      3
4 Adelie  female        3450               193      4
5 Adelie  male          3650               190      5
6 Adelie  female        3625               181      6

dplyr’s slice_min(): Get the row(s) with minimum value for a column

To find the row with the lowest value of a column, we use slice_min() with the column name and n = 1 as arguments. In our example below, we use slice_min() to get the row with the lowest body mass in our penguins data.

penguins %>%
  slice_min(body_mass_g, n = 1)

# A tibble: 1 × 5
  species   sex    body_mass_g flipper_length_mm row_id
  <fct>     <fct>        <int>             <int>  <int>
1 Chinstrap female        2700               192    304
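As a cross-check on what slice_min() with n = 1 returns, we can compare it with filtering on the column minimum. This is a minimal sketch on a small made-up tibble; the name toy and its id and mass values are assumptions for illustration and are not taken from the penguins data.

```r
library(dplyr)

# Toy data: hypothetical values, not from palmerpenguins
toy <- tibble(
  id   = 1:5,
  mass = c(3750, 2700, 3250, 2850, 2850)
)

# Row with the smallest mass, using slice_min()
lowest <- toy %>%
  slice_min(mass, n = 1)

# The same row, found by filtering on the minimum of the column
lowest_by_filter <- toy %>%
  filter(mass == min(mass))

lowest
lowest_by_filter
```

Both approaches pick the row with id 2 here. The slice_min() version tends to read more directly, and it extends to n greater than 1 without any change in style.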

dplyr’s slice_min(): Get bottom 2 rows with minimum values for a column

With slice_min() we can get the bottom two rows with the lowest values for a column by specifying n = 2 as an argument.
For example, when we use n = 2 with the body mass column as arguments to slice_min(), we get the rows containing the two lowest body masses. Because slice_min() keeps ties by default, the result below has three rows: two penguins share the second-lowest body mass of 2850 g.

penguins %>%
  slice_min(body_mass_g, n = 2)

# A tibble: 3 × 5
  species   sex    body_mass_g flipper_length_mm row_id
  <fct>     <fct>        <int>             <int>  <int>
1 Chinstrap female        2700               192    304
2 Adelie    female        2850               181     53
3 Adelie    female        2850               184     59
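If we want exactly n rows regardless of ties, slice_min() accepts a with_ties argument, which is TRUE by default. The sketch below uses a small made-up tibble (the name toy and its values are assumptions for illustration) to compare the two settings.

```r
library(dplyr)

# Toy data: hypothetical values with a tie for the second-lowest mass
toy <- tibble(
  id   = 1:5,
  mass = c(3750, 2700, 3250, 2850, 2850)
)

# Default behaviour (with_ties = TRUE): both tied rows are kept,
# so n = 2 returns 3 rows
with_ties <- toy %>%
  slice_min(mass, n = 2)
nrow(with_ties)

# with_ties = FALSE: the result is cut off at exactly n = 2 rows
no_ties <- toy %>%
  slice_min(mass, n = 2, with_ties = FALSE)
nrow(no_ties)
```

With with_ties = FALSE, one of the tied rows is dropped, so this setting is best used when it does not matter which of the tied rows is kept.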

dplyr’s slice_min(): Get bottom n rows with minimum values for a column

Similarly, we can get the bottom n rows with the lowest values for a column by specifying the value of n we want.

penguins %>%
  slice_min(body_mass_g, n = 10)

# A tibble: 10 × 5
   species   sex    body_mass_g flipper_length_mm row_id
   <fct>     <fct>        <int>             <int>  <int>
 1 Chinstrap female        2700               192    304
 2 Adelie    female        2850               181     53
 3 Adelie    female        2850               184     59
 4 Adelie    female        2900               187     49
 5 Adelie    female        2900               178     93
 6 Adelie    female        2900               188    111
 7 Chinstrap female        2900               187    288
 8 Adelie    female        2925               193     99
 9 Adelie    female        3000               185     40
10 Adelie    female        3000               192    139
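slice_min() also works on grouped data, where it returns the bottom n rows within each group. Here is a minimal sketch that finds the lightest row per species; the tibble toy, its species labels, and its mass values are made up for illustration.

```r
library(dplyr)

# Toy data: hypothetical species labels and masses
toy <- tibble(
  species = c("A", "A", "B", "B", "B"),
  mass    = c(3750, 2700, 3250, 2850, 3000)
)

# Lightest row within each species
lightest_per_species <- toy %>%
  group_by(species) %>%
  slice_min(mass, n = 1) %>%
  ungroup()

lightest_per_species
```

The result has one row per species here: mass 2700 for species A and mass 2850 for species B.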

You might want to check out the tutorial on using dplyr’s slice_max() to get the top n rows for a specific column.
