In this tutorial, we will learn how to use dplyr’s transmute() function, which creates new columns and drops the existing ones in a single step. We will start with an example that creates one new column and then show an example that creates more than one.
First, let us load the tidyverse, a suite of R packages that includes dplyr.
library(tidyverse)
packageVersion("dplyr")
## [1] '1.0.9'
Let us create a simple dataframe using tibble() from scratch. Our toy dataframe has three columns and three rows.
df <- tibble(species = c("Adelie", "Chinstrap", "Gentoo"),
             body_mass = c(3700, 3733, 5076),
             bill_length = c(39, 49, 48))
df
## # A tibble: 3 × 3
##   species   body_mass bill_length
##   <chr>         <dbl>       <dbl>
## 1 Adelie         3700          39
## 2 Chinstrap      3733          49
## 3 Gentoo         5076          48
dplyr transmute() example: creating a new column
In the example below, we use the transmute() function to create a new column from an existing column. The resulting dataframe contains only the new column we created.
df %>%
  transmute(body_mass_kg = body_mass/1000)
## # A tibble: 3 × 1
##   body_mass_kg
##          <dbl>
## 1         3.7
## 2         3.73
## 3         5.08
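Because transmute() drops every column that is not named in the call, a column you want to carry along has to be listed explicitly. Here is a minimal sketch that rebuilds the same toy dataframe and keeps species next to the new column; the object name kept is only illustrative.

```r
library(dplyr)

# Same toy data as above, rebuilt so this snippet runs on its own
df <- tibble(species = c("Adelie", "Chinstrap", "Gentoo"),
             body_mass = c(3700, 3733, 5076),
             bill_length = c(39, 49, 48))

# Naming an existing column inside transmute() keeps it in the result
kept <- df %>%
  transmute(species, body_mass_kg = body_mass/1000)

names(kept)
```

The result has two columns, species and body_mass_kg, in the order they were listed.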
dplyr transmute() example: creating multiple columns
Here is an example of creating multiple new columns using dplyr’s transmute(). We use a single transmute() call and create as many columns as needed, separating each new column with a comma.
As before, the resulting dataframe contains only the new columns.
df %>%
  transmute(body_mass_kg = body_mass/1000,
            bill_length_m = bill_length/1000)
## # A tibble: 3 × 2
##   body_mass_kg bill_length_m
##          <dbl>         <dbl>
## 1         3.7          0.039
## 2         3.73         0.049
## 3         5.08         0.048
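Columns in a transmute() call are created from left to right, so a later expression can refer to a column defined earlier in the same call. The sketch below illustrates this with a hypothetical body_mass_lb column; the conversion factor 2.20462 pounds per kilogram and the object name derived are assumptions added for the example, not part of the original data.

```r
library(dplyr)

# Same toy data as above, rebuilt so this snippet runs on its own
df <- tibble(species = c("Adelie", "Chinstrap", "Gentoo"),
             body_mass = c(3700, 3733, 5076),
             bill_length = c(39, 49, 48))

# body_mass_lb is computed from body_mass_kg, which is created
# one argument earlier in the same transmute() call
derived <- df %>%
  transmute(body_mass_kg = body_mass/1000,
            body_mass_lb = body_mass_kg * 2.20462)  # assumed conversion factor

derived
```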
Difference between dplyr’s mutate() and transmute() function
dplyr’s mutate() and transmute() functions both create new columns, but they differ in what happens to the existing ones. As the examples above show, transmute() keeps only the columns named in the call and drops the rest, whereas mutate() adds the new columns and keeps all the existing columns.
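To make the difference concrete, here is a small sketch that runs the same expression through both functions on the toy dataframe and compares the resulting column names. The object names with_mutate and with_transmute are only illustrative.

```r
library(dplyr)

# Same toy data as above, rebuilt so this snippet runs on its own
df <- tibble(species = c("Adelie", "Chinstrap", "Gentoo"),
             body_mass = c(3700, 3733, 5076),
             bill_length = c(39, 49, 48))

# mutate() appends the new column to the existing ones
with_mutate <- df %>%
  mutate(body_mass_kg = body_mass/1000)

# transmute() returns only the column named in the call
with_transmute <- df %>%
  transmute(body_mass_kg = body_mass/1000)

names(with_mutate)
## [1] "species"      "body_mass"    "bill_length"  "body_mass_kg"
names(with_transmute)
## [1] "body_mass_kg"
```

Note that more recent dplyr releases (1.1.0 and later) document transmute() as superseded in favor of mutate() with .keep = "none", so it is worth checking the documentation for your installed version.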