dplyr ends_with(): select columns that end with a suffix

In this tutorial, we will learn how to select columns whose names end with a given string, with multiple examples using dplyr and base R. With dplyr, we can use ends_with(), one of the select helper functions, to select columns that end with a suffix/string. Similarly, base R has the endsWith() function, which helps select columns whose names end with a string.

To get started with some examples, let us load the tidyverse and palmerpenguins packages.

library(tidyverse)
library(palmerpenguins)
packageVersion("dplyr")

## [1] '1.0.9'
penguins %>% head(5)

## # A tibble: 5 × 8
##   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex  
##   <fct>   <fct>           <dbl>         <dbl>            <int>       <int> <fct>
## 1 Adelie  Torge…           39.1          18.7              181        3750 male 
## 2 Adelie  Torge…           39.5          17.4              186        3800 fema…
## 3 Adelie  Torge…           40.3          18                195        3250 fema…
## 4 Adelie  Torge…           NA            NA                 NA          NA <NA> 
## 5 Adelie  Torge…           36.7          19.3              193        3450 fema…
## # … with 1 more variable: year <int>

dplyr ends_with() to select columns ending with a letter

To select columns whose names end with a letter, we can use dplyr’s ends_with() function with that letter as the argument.

penguins %>%
  select(ends_with("g"))

## # A tibble: 344 × 1
##    body_mass_g
##          <int>
##  1        3750
##  2        3800
##  3        3250
##  4          NA
##  5        3450
##  6        3650
##  7        3625
##  8        4675
##  9        3475
## 10        4250
## # … with 334 more rows

dplyr ends_with() to select columns ending with a suffix

Similarly, to select columns ending with a suffix, we pass the suffix as the argument to the ends_with() function.

penguins %>%
  select(ends_with("mm"))

## # A tibble: 344 × 3
##    bill_length_mm bill_depth_mm flipper_length_mm
##             <dbl>         <dbl>             <int>
##  1           39.1          18.7               181
##  2           39.5          17.4               186
##  3           40.3          18                 195
##  4           NA            NA                  NA
##  5           36.7          19.3               193
##  6           39.3          20.6               190
##  7           38.9          17.8               181
##  8           39.2          19.6               195
##  9           34.1          18.1               193
## 10           42            20.2               190
## # … with 334 more rows
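ends_with() also accepts a character vector of suffixes, in which case a column is kept if its name ends with any of them. Here is a minimal sketch of that usage on the same data; the variable name mm_or_g is just an illustrative choice.

```r
library(dplyr)
library(palmerpenguins)

# Passing several suffixes keeps every column whose name
# ends with at least one of them
mm_or_g <- penguins %>%
  select(ends_with(c("mm", "g")))

colnames(mm_or_g)
```

This should return the three columns ending in "mm" together with body_mass_g.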

base R’s endsWith() to select columns ending with a letter

We can also use base R’s endsWith() function to select columns ending with a letter. We need to provide endsWith() with the column names and the letter, and it returns a boolean vector as output.

endsWith(colnames(penguins), "g")
## [1] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE

In order to select the columns, we need to subset the dataframe using the boolean vector. In our example of selecting columns ending with a letter “g”, we get a dataframe with one column.

penguins[, endsWith(colnames(penguins), "g")]

## # A tibble: 344 × 1
##    body_mass_g
##          <int>
##  1        3750
##  2        3800
##  3        3250
##  4          NA
##  5        3450
##  6        3650
##  7        3625
##  8        4675
##  9        3475
## 10        4250
## # … with 334 more rows
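Note that we get a one-column dataframe here because penguins is a tibble. With a plain data.frame, selecting a single column this way is simplified to a vector unless we add drop = FALSE. The sketch below illustrates the difference; penguins_df, keep, as_vector and as_df are names made up for this example.

```r
library(palmerpenguins)

# Convert the tibble to a plain data.frame to show base R's default behaviour
penguins_df <- as.data.frame(penguins)
keep <- endsWith(colnames(penguins_df), "g")

# A single matching column is simplified to a vector by default
as_vector <- penguins_df[, keep]
is.data.frame(as_vector)

# drop = FALSE keeps the result as a one-column data.frame
as_df <- penguins_df[, keep, drop = FALSE]
is.data.frame(as_df)
```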

base R’s endsWith() to select columns ending with a suffix

Here is an example of endsWith() function in base R showing how to select columns that end with a suffix. As in the previous example, we first get a boolean vector and then subset the dataframe using the boolean vector.

endsWith(colnames(penguins), "mm")
## [1] FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE
penguins[, endsWith(colnames(penguins), "mm")]

## # A tibble: 344 × 3
##    bill_length_mm bill_depth_mm flipper_length_mm
##             <dbl>         <dbl>             <int>
##  1           39.1          18.7               181
##  2           39.5          17.4               186
##  3           40.3          18                 195
##  4           NA            NA                  NA
##  5           36.7          19.3               193
##  6           39.3          20.6               190
##  7           38.9          17.8               181
##  8           39.2          19.6               195
##  9           34.1          18.1               193
## 10           42            20.2               190
## # … with 334 more rows
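One difference between the two approaches is case sensitivity: dplyr’s ends_with() ignores case by default (its ignore.case argument defaults to TRUE), while base R’s endsWith() matches case exactly. Here is a small sketch comparing the two with an upper-case suffix; the variable names are only for illustration.

```r
library(dplyr)
library(palmerpenguins)

# ends_with() ignores case by default, so "MM" still matches the "_mm" columns
upper_dplyr <- penguins %>%
  select(ends_with("MM")) %>%
  colnames()
upper_dplyr

# endsWith() is case-sensitive, so no column name matches "MM"
upper_base <- endsWith(colnames(penguins), "MM")
upper_base

# Setting ignore.case = FALSE makes ends_with() case-sensitive as well
upper_strict <- penguins %>%
  select(ends_with("MM", ignore.case = FALSE)) %>%
  colnames()
upper_strict
```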

Sometimes you might want to select columns that do not end with a suffix. To do so, we use the negation operator (!) in front of the ends_with() or endsWith() function.
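The negation described above can be sketched as follows, using the "mm" suffix from the earlier examples. The names not_mm_dplyr and not_mm_base are illustrative only.

```r
library(dplyr)
library(palmerpenguins)

# dplyr: negate the selection helper inside select()
not_mm_dplyr <- penguins %>%
  select(!ends_with("mm"))

# base R: negate the boolean vector before subsetting
not_mm_base <- penguins[, !endsWith(colnames(penguins), "mm")]

colnames(not_mm_dplyr)
colnames(not_mm_base)
```

Both approaches should keep the same five columns that do not end in "mm".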
