Rstats 101

Learn R Programming Tips & Tricks for Statistics and Data Science

How to sum a column by group in R

rstats101 · April 17, 2022

In this tutorial, we will learn how to compute the sum of one column within the groups defined by another column of a dataframe. The basic idea is to group the dataframe by the grouping variable/column and then find the sum for each group.

Let us get started by loading the tidyverse suite of R packages. We will be using dplyr’s group_by() and summarize() functions to find the sum/total of a variable by group.
library(tidyverse)

We will create a simple dataframe with two columns, where one is a grouping variable and the other is a numerical variable. We use tibble() to create the dataframe from scratch, using the sample() function to generate both variables. Calling set.seed() first makes the random draws repeatable.

set.seed(41)
df <- tibble(
  grp    = sample(c("g1", "g2"), 8, replace = TRUE),
  counts = sample(1:20, 8)
)

Our dataframe looks like this. The grouping variable “grp” has two unique values, g1 and g2, and we will compute the sum of “counts” for each of them using the group_by() and summarize() functions from dplyr.

df

## # A tibble: 8 × 2
##   grp   counts
##   <chr>  <int>
## 1 g2         6
## 2 g1        18
## 3 g1         2
## 4 g2        13
## 5 g2        17
## 6 g2        19
## 7 g2         5
## 8 g1        20
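
As a quick sanity check before reaching for dplyr, here is a minimal base R sketch that adds up each group by hand. It assumes the dataframe holds exactly the values printed above, so we type them in directly instead of relying on sample(), whose draws can differ between R versions. The names g1_total and g2_total are just illustrative.

```r
# Assumption: these are the values printed above, entered by hand
df <- data.frame(
  grp    = c("g2", "g1", "g1", "g2", "g2", "g2", "g2", "g1"),
  counts = c(6L, 18L, 2L, 13L, 17L, 19L, 5L, 20L)
)

# The distinct values of the grouping column
groups <- unique(df$grp)

# Sum of counts for one group at a time, using logical subsetting
g1_total <- sum(df$counts[df$grp == "g1"])   # 18 + 2 + 20
g2_total <- sum(df$counts[df$grp == "g2"])   # 6 + 13 + 17 + 19 + 5
```

Writing one such line per group does not scale beyond a handful of groups, which is the bookkeeping that group_by() and summarize() take care of.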

Computing sum of column in a dataframe based on a grouping column in R

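dplyr’s group_by() function allows us to group the rows of a dataframe by a variable of interest. It does not physically split the data: the result after group_by() has all the rows and columns of the original dataframe, plus grouping information (note the “Groups: grp [2]” line in the output) that later functions such as summarize() use to work on each group separately.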

df %>%
  group_by(grp)

## # A tibble: 8 × 2
## # Groups:   grp [2]
##   grp   counts
##   <chr>  <int>
## 1 g2         6
## 2 g1        18
## 3 g1         2
## 4 g2        13
## 5 g2        17
## 6 g2        19
## 7 g2         5
## 8 g1        20
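
To see what that grouping information contains, the following small sketch inspects a grouped dataframe with dplyr’s group_vars(), n_groups() and group_size() helpers. It assumes the same eight rows printed above, entered by hand so the example does not depend on sample(); the name grouped is only for illustration.

```r
library(dplyr)

# Assumption: these are the values printed above, entered by hand
df <- data.frame(
  grp    = c("g2", "g1", "g1", "g2", "g2", "g2", "g2", "g1"),
  counts = c(6L, 18L, 2L, 13L, 17L, 19L, 5L, 20L)
)

grouped <- df %>%
  group_by(grp)

group_vars(grouped)   # name(s) of the grouping column(s)
n_groups(grouped)     # number of distinct groups
group_size(grouped)   # number of rows in each group, in sorted group order

# Grouping adds metadata only; no rows are added, dropped or reordered
nrow(grouped) == nrow(df)
```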

Then, we can use the summarize() function to compute the sum of “counts” within each group. The result has one row per group, and the new column is named “total”.

df %>%
  group_by(grp) %>%
  summarize(total = sum(counts))

## # A tibble: 2 × 2
##   grp   total
##   <chr> <int>
## 1 g1       40
## 2 g2       60
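
The totals can be cross-checked in a couple of other ways. The sketch below, which again assumes the eight rows printed above entered by hand, compares the group_by() and summarize() result with base R’s tapply() and with dplyr’s count() using its wt argument. It also shows one common pitfall as an illustration: if the column contains a missing value, sum() returns NA for that group unless na.rm = TRUE is set. The copy df_na and its injected NA are hypothetical and not part of the original data.

```r
library(dplyr)

# Assumption: these are the values printed above, entered by hand
df <- data.frame(
  grp    = c("g2", "g1", "g1", "g2", "g2", "g2", "g2", "g1"),
  counts = c(6L, 18L, 2L, 13L, 17L, 19L, 5L, 20L)
)

# The approach from this tutorial
totals <- df %>%
  group_by(grp) %>%
  summarize(total = sum(counts))

# Base R: apply sum() to counts within each level of grp
base_totals <- tapply(df$counts, df$grp, sum)

# dplyr shortcut: count() with wt sums the weights instead of counting rows
counted <- df %>%
  count(grp, wt = counts, name = "total")

# Hypothetical variation: make the first value (a g2 row) missing
df_na <- df
df_na$counts[1] <- NA

with_na <- df_na %>%
  group_by(grp) %>%
  summarize(
    total       = sum(counts),                # NA for any group containing an NA
    total_no_na = sum(counts, na.rm = TRUE)   # ignores missing values
  )
```

All three approaches should agree on the totals for g1 and g2. Whether na.rm = TRUE is appropriate depends on what a missing value means in your data, so treat it as a deliberate choice rather than a default.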
