• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar

Rstats 101

Learn R Programming Tips & Tricks for Statistics and Data Science

  • Home
  • About
    • Privacy Policy
  • Show Search
Hide Search

Join dataframes by different column names with dplyr

rstats101 · October 7, 2023 ·

In this tutorial, we will learn how to join dataframes by columns with different names in the two dataframes.

dplyr offers a great set of tools to join or merge dataframes. We will use two approaches to merge two dataframe by different names.

Join dataframes by different column names with dplyr
Join dataframes by different column names with dplyr

We will start with loading tidyverse.

library(tidyverse)

Then we will create two toy dataframes such that the columns we would like to join them have different names.

Our first toy dataframe has two columns.

penguins1 <- tibble(species = c("Gentoo","Adelie", "Chinstrap"),
                    sex = c("male", "female", "male"))
penguins1

## # A tibble: 3 × 2
##   species   sex   
##   <chr>     <chr> 
## 1 Gentoo    male  
## 2 Adelie    female
## 3 Chinstrap male

And the toy dataframe has two columns. And the first column in the second dataframe refers to the same variable in the first dataframe, but with different name.

penguins2 <- tibble(penguin = c("Gentoo","Adelie", "Chinstrap"),
                    body_mass = c(5700, 4500, 3900))
penguins2

## # A tibble: 3 × 2
##   penguin   body_mass
##   <chr>         <dbl>
## 1 Gentoo         5700
## 2 Adelie         4500
## 3 Chinstrap      3900

Joining dataframes with different column names with “by =c()”

One of the ways to join or merge two dataframes using columns that have different names is to use by argument with c(). Within c(), we specify the name in the first dataframe equals the name in the second dataframe. Note that the column names are provided as a string, i.e. within double quotes.

penguins1 %>%
  left_join(penguins2, by = c("species"="penguin"))

## # A tibble: 3 × 3
##   species   sex    body_mass
##   <chr>     <chr>      <dbl>
## 1 Gentoo    male        5700
## 2 Adelie    female      4500
## 3 Chinstrap male        3900

Joining dataframes with different column names using “join_by”

Another way to join or merge two dataframes using columns that have different names is to use join_by() by. Within join_by(), we specify the name in the first dataframe and the name in the second dataframe with equality == symbol. Note that the column names are provided as a variable i.e. without double quotes.

penguins1 %>%
  full_join(penguins2, by = join_by(species==penguin))

## # A tibble: 3 × 3
##   species   sex    body_mass
##   <chr>     <chr>      <dbl>
## 1 Gentoo    male        5700
## 2 Adelie    female      4500
## 3 Chinstrap male        3900

In the above examples we showed how to join dataframes by different column names using left_join()/full_join(). We could also use other joins in dplyr with different names.

Related

Filed Under: dplyr left_join(), rstats101 Tagged With: join dataframes by different column names dplyr

Reader Interactions

Trackbacks

  1. How to do inner join with dplyr in R - Rstats 101 says:
    October 7, 2023 at 1:39 pm

    […] are joining. Often you might want to inner join two dataframes with two different column names. See this post to learn how to join two dataframes with different column […]

Primary Sidebar

Recent Posts

  • How to create a nested dataframe with lists
  • How to compute proportion with tidyverse
  • How to Compute Z-Score of Multiple Columns
  • How to drop unused level of factor variable in R
  • How to compute Z-score

Categories

%in% arrange() as.data.frame as_tibble built-in data R colSums() R cor() in R data.frame dplyr dplyr across() dplyr group_by() dplyr rename() dplyr rowwise() dplyr row_number() dplyr select() dplyr slice_max() dplyr slice_sample() drop_na R duplicated() gsub head() impute with mean values is.element() linear regression matrix() function na.omit R NAs in R near() R openxlsx pivot_longer() prod() R.version replace NA replace NAs tidyverse R Function rstats rstats101 R version scale() sessionInfo() t.test() tidyr tidyselect tidyverse write.xlsx

Copyright © 2025 · Daily Dish Pro on Genesis Framework · WordPress · Log in

Go to mobile version