In this tutorial, we will learn how to get the rows with the lowest values of a column from a data frame in R. We will use dplyr’s slice_min() function, first to select the single row with the lowest value of a column and then to find the bottom n rows with the lowest values of a variable.
library(tidyverse)
library(palmerpenguins)
To illustrate how slice_min() works, let us simplify the penguins data by dropping rows with missing values and selecting just a few columns. We will also add a row number column using dplyr’s row_number() so we can quickly see which rows we have selected.
penguins <- penguins %>%
  drop_na() %>%
  select(species, sex, body_mass_g, flipper_length_mm) %>%
  mutate(row_id = row_number())
Our data looks like this.
penguins %>%
  head()
# A tibble: 6 × 5
  species sex    body_mass_g flipper_length_mm row_id
  <fct>   <fct>        <int>             <int>  <int>
1 Adelie  male          3750               181      1
2 Adelie  female        3800               186      2
3 Adelie  female        3250               195      3
4 Adelie  female        3450               193      4
5 Adelie  male          3650               190      5
6 Adelie  female        3625               181      6
dplyr’s slice_min(): Get the row(s) with minimum value for a column
To find the row with the lowest value of a column, we use slice_min() with the column name and n = 1 as arguments. In the example below, we use slice_min() to get the row with the lowest body mass in our penguins data.
penguins %>%
  slice_min(body_mass_g, n = 1)
# A tibble: 1 × 5
  species   sex    body_mass_g flipper_length_mm row_id
  <fct>     <fct>        <int>             <int>  <int>
1 Chinstrap female        2700               192    304
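To make clear what slice_min() with n = 1 is doing, here is a minimal sketch comparing it with filtering on the column minimum. The toy data frame and the names toy, via_slice and via_filter are hypothetical and used only for illustration; they are not part of the penguins data. The comparison assumes the column has no missing values.

```r
library(dplyr)

# Hypothetical toy data, not the real penguins table.
toy <- data.frame(
  row_id = 1:4,
  body_mass_g = c(3750, 2700, 3250, 3450)
)

# Row with the lowest body mass using slice_min().
via_slice <- toy %>%
  slice_min(body_mass_g, n = 1)

# The same row, found by filtering on the column minimum.
via_filter <- toy %>%
  filter(body_mass_g == min(body_mass_g))
```

Both approaches pick the row with row_id 2 here. slice_min() becomes the more convenient option once we want more than one row.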
dplyr’s slice_min(): Get bottom 2 rows with minimum values for a column
With slice_min() we can get the bottom two rows with the lowest values for a column by specifying n = 2 as an argument.
For example, when we pass the body mass column and n = 2 to slice_min(), we get the rows containing the two lowest body masses. Note that the result has three rows rather than two: two penguins tie at 2850 g, and slice_min() keeps tied rows by default.
penguins %>%
  slice_min(body_mass_g, n = 2)
# A tibble: 3 × 5
  species   sex    body_mass_g flipper_length_mm row_id
  <fct>     <fct>        <int>             <int>  <int>
1 Chinstrap female        2700               192    304
2 Adelie    female        2850               181     53
3 Adelie    female        2850               184     59
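If we need exactly n rows even when values are tied, slice_min() has a with_ties argument that can be set to FALSE. The sketch below uses a small hypothetical data frame (toy, with made-up values chosen so that two rows tie) rather than the penguins data.

```r
library(dplyr)

# Hypothetical toy data: rows 3 and 4 tie for the second-lowest value.
toy <- data.frame(
  row_id = 1:5,
  body_mass_g = c(3750, 2700, 2850, 2850, 2900)
)

# Default (with_ties = TRUE): tied rows are kept, so n = 2 gives 3 rows.
ties_kept <- toy %>%
  slice_min(body_mass_g, n = 2)

# with_ties = FALSE: exactly n rows are returned.
exactly_two <- toy %>%
  slice_min(body_mass_g, n = 2, with_ties = FALSE)
```

Whether to keep ties depends on the question: keeping them avoids arbitrarily dropping one of several equally light penguins, while dropping them guarantees a fixed number of rows.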
dplyr’s slice_min(): Get bottom n rows with minimum values for a column
Similarly, we can get the bottom n rows with the lowest values for a column by specifying the value of n we want.
penguins %>%
  slice_min(body_mass_g, n = 10)
# A tibble: 10 × 5
   species   sex    body_mass_g flipper_length_mm row_id
   <fct>     <fct>        <int>             <int>  <int>
 1 Chinstrap female        2700               192    304
 2 Adelie    female        2850               181     53
 3 Adelie    female        2850               184     59
 4 Adelie    female        2900               187     49
 5 Adelie    female        2900               178     93
 6 Adelie    female        2900               188    111
 7 Chinstrap female        2900               187    288
 8 Adelie    female        2925               193     99
 9 Adelie    female        3000               185     40
10 Adelie    female        3000               192    139
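The same call can also be applied within groups. As a minimal sketch, assuming we want the lightest row per species, we can combine group_by() with slice_min(). The data frame toy and its values are hypothetical and only stand in for the penguins data.

```r
library(dplyr)

# Hypothetical toy data with two made-up species labels.
toy <- data.frame(
  species = c("A", "A", "A", "B", "B"),
  body_mass_g = c(3750, 2700, 2850, 4100, 3900)
)

# Lowest body mass within each species, then drop the grouping.
lightest_per_species <- toy %>%
  group_by(species) %>%
  slice_min(body_mass_g, n = 1) %>%
  ungroup()
```

Here n applies to each group separately, so the result has one row per species as long as there are no ties within a group.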
You might want to check out the tutorial on using dplyr’s slice_max() to get the top n rows for a specific column.