In this tutorial, we will learn how to select columns whose names end with a given string, with multiple examples using dplyr and base R. With dplyr, we can use ends_with(), one of the select helper functions, to select columns whose names end with a suffix/string. Similarly, base R has the endsWith() function, which helps select columns whose names end with a string.
To get started with some examples, let us load the tidyverse and palmerpenguins packages.
library(tidyverse)
library(palmerpenguins)
packageVersion("dplyr")
## [1] '1.0.9'
penguins %>% head(5)
## # A tibble: 5 × 8
##   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex
##   <fct>   <fct>           <dbl>         <dbl>            <int>       <int> <fct>
## 1 Adelie  Torge…           39.1          18.7              181        3750 male
## 2 Adelie  Torge…           39.5          17.4              186        3800 fema…
## 3 Adelie  Torge…           40.3          18                195        3250 fema…
## 4 Adelie  Torge…           NA            NA                 NA          NA <NA>
## 5 Adelie  Torge…           36.7          19.3              193        3450 fema…
## # … with 1 more variable: year <int>
dplyr ends_with() to select columns ending with a letter
To select columns whose names end with a letter, we can use dplyr’s ends_with() function with that letter as its argument.
penguins %>% select(ends_with("g"))
## # A tibble: 344 × 1
##    body_mass_g
##          <int>
##  1        3750
##  2        3800
##  3        3250
##  4          NA
##  5        3450
##  6        3650
##  7        3625
##  8        4675
##  9        3475
## 10        4250
## # … with 334 more rows
dplyr ends_with() to select columns ending with a suffix
Similarly, to select columns ending with a suffix, we pass the suffix as the argument to the ends_with() function.
penguins %>% select(ends_with("mm"))
## # A tibble: 344 × 3
##    bill_length_mm bill_depth_mm flipper_length_mm
##             <dbl>         <dbl>             <int>
##  1           39.1          18.7               181
##  2           39.5          17.4               186
##  3           40.3          18                 195
##  4           NA            NA                  NA
##  5           36.7          19.3               193
##  6           39.3          20.6               190
##  7           38.9          17.8               181
##  8           39.2          19.6               195
##  9           34.1          18.1               193
## 10           42            20.2               190
## # … with 334 more rows
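ends_with() also accepts a character vector of suffixes and, by default, ignores case when matching. Here is a minimal sketch of both behaviours. The data frame toy and its column names are made up for illustration, so that the example runs without palmerpenguins.

```r
library(dplyr)

# Hypothetical toy data frame, used only for illustration
toy <- data.frame(
  bill_length_mm = c(39.1, 39.5),
  body_mass_g    = c(3750L, 3800L),
  Weight_MM      = c(1.2, 3.4),
  year           = c(2007L, 2008L)
)

# A character vector selects columns matching any of the suffixes
both <- toy %>% select(ends_with(c("mm", "g")))
names(both)

# Matching ignores case by default; ignore.case = FALSE makes it strict
strict <- toy %>% select(ends_with("mm", ignore.case = FALSE))
names(strict)
```

With the default settings, Weight_MM is picked up by the "mm" suffix. With ignore.case = FALSE, only bill_length_mm remains.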
base R’s endsWith() to select columns ending with a letter
We can also use base R’s endsWith() function to select columns ending with a letter. We provide endsWith() with the column names and the letter, and it returns a logical (boolean) vector as output.
endsWith(colnames(penguins), "g")
## [1] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
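It may help to see endsWith() on a plain character vector first. Unlike dplyr’s ends_with(), it compares characters exactly, so it is case sensitive and does not interpret regular expressions. A small sketch, using a made-up vector of names called cols:

```r
# Hypothetical column names, for illustration only
cols <- c("bill_length_mm", "body_mass_g", "Weight_MM")

# One TRUE/FALSE per element; "Weight_MM" does not match lowercase "mm"
is_mm <- endsWith(cols, "mm")
is_mm

# The logical vector can be used to pull out the matching names
matched <- cols[is_mm]
matched

# Lower-casing the names first gives a case-insensitive match
is_mm_any_case <- endsWith(tolower(cols), "mm")
is_mm_any_case
```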
In order to select the columns, we need to subset the dataframe using the boolean vector. In our example of selecting columns ending with a letter “g”, we get a dataframe with one column.
penguins[, endsWith(colnames(penguins), "g")]
## # A tibble: 344 × 1
##    body_mass_g
##          <int>
##  1        3750
##  2        3800
##  3        3250
##  4          NA
##  5        3450
##  6        3650
##  7        3625
##  8        4675
##  9        3475
## 10        4250
## # … with 334 more rows
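One caveat: penguins is a tibble, and subsetting a tibble with [ always returns a tibble. A plain data.frame is simplified to a vector when only one column matches, unless we pass drop = FALSE. A minimal sketch, assuming a made-up data frame df:

```r
# Hypothetical plain data.frame (not a tibble), for illustration only
df <- data.frame(
  bill_length_mm = c(39.1, 39.5),
  body_mass_g    = c(3750L, 3800L),
  year           = c(2007L, 2008L)
)

keep <- endsWith(colnames(df), "g")

# With a single matching column, the result is simplified to a vector
as_vector <- df[, keep]
is.data.frame(as_vector)

# drop = FALSE keeps the result as a one-column data.frame
as_df <- df[, keep, drop = FALSE]
is.data.frame(as_df)
```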
base R’s endsWith() to select columns ending with a suffix
Here is an example of endsWith() function in base R showing how to select columns that end with a suffix. As in the previous example, we first get a boolean vector and then subset the dataframe using the boolean vector.
endsWith(colnames(penguins), "mm")
## [1] FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE
penguins[, endsWith(colnames(penguins), "mm")]
## # A tibble: 344 × 3
##    bill_length_mm bill_depth_mm flipper_length_mm
##             <dbl>         <dbl>             <int>
##  1           39.1          18.7               181
##  2           39.5          17.4               186
##  3           40.3          18                 195
##  4           NA            NA                  NA
##  5           36.7          19.3               193
##  6           39.3          20.6               190
##  7           38.9          17.8               181
##  8           39.2          19.6               195
##  9           34.1          18.1               193
## 10           42            20.2               190
## # … with 334 more rows
Sometimes you might want to select columns that do not end with a suffix. To select them, we put the negation operator (!) in front of the ends_with() or endsWith() call.
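Here is a minimal sketch of the negation with both approaches. The data frame toy is made up for illustration; the same calls should work on penguins.

```r
library(dplyr)

# Hypothetical toy data frame, for illustration only
toy <- data.frame(
  bill_length_mm = c(39.1, 39.5),
  bill_depth_mm  = c(18.7, 17.4),
  body_mass_g    = c(3750L, 3800L),
  year           = c(2007L, 2008L)
)

# dplyr: negate the selection helper with !
not_mm_dplyr <- toy %>% select(!ends_with("mm"))
names(not_mm_dplyr)

# base R: negate the logical vector returned by endsWith()
not_mm_base <- toy[, !endsWith(colnames(toy), "mm"), drop = FALSE]
names(not_mm_base)
```

Both versions keep body_mass_g and year and drop the two columns ending in "mm".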