In this tutorial, we will learn how to compute the product of all elements of a column in a dataframe using tidyverse. We will use the base R function prod() to multiply all the elements of a column.
Let us first load tidyverse.
library(tidyverse)
We will create a simple dataframe using tidyverse's tibble() function, with a column containing the numbers from 1 to 6.
df <- tibble(id = seq(6))
df
# A tibble: 6 × 1
     id
  <int>
1     1
2     2
3     3
4     4
5     5
6     6
Here we apply prod() to the column of interest to compute the product of all its elements. We use the summarize() function from dplyr so that the result is returned as a dataframe. Note that prod() returns a double even for integer input, which is why the result column has type <dbl>.
df %>%
  summarize(product = prod(id))
# A tibble: 1 × 1
  product
    <dbl>
1     720
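As a quick cross-check, the same product can be computed without summarize(). The sketch below assumes the six-row df created earlier and pulls the column out as a plain vector; the names p_pull and p_base are only illustrative.

```r
library(tidyverse)

# Same six-row dataframe as in the tutorial.
df <- tibble(id = seq(6))

# Extract the column as a vector with pull(), then multiply its elements.
p_pull <- df %>% pull(id) %>% prod()

# Base R equivalent, accessing the column with $.
p_base <- prod(df$id)

# Both values should match 6 factorial.
p_pull
p_base
factorial(6)
```

Unlike summarize(), both of these return a bare number rather than a dataframe, which can be handy when the product is needed inside a further calculation.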
Note that, by default, prod() does not remove NAs. Let us see an example of how to handle NAs with prod(). First, let us create a column containing an NA.
set.seed(1234)
df <- tibble(id = sample(c(NA,seq(5)), 5, replace=FALSE))
df
# A tibble: 5 × 1
     id
  <int>
1     3
2     1
3     4
4    NA
5     5
If a column contains one or more missing values (NAs), prod() returns NA as the result.
df %>%
  summarize(product = prod(id))
# A tibble: 1 × 1
  product
    <dbl>
1      NA
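Before deciding how to handle the missing values, it can help to see how many there are. The following is a minimal sketch; it builds the dataframe directly from the values printed above instead of using sample(), and the column names n_missing and any_missing are only illustrative.

```r
library(tidyverse)

# Values copied from the printed tibble above (one NA in the fourth row).
df <- tibble(id = c(3L, 1L, 4L, NA, 5L))

# Count the missing values and flag whether any are present.
na_summary <- df %>%
  summarize(
    n_missing   = sum(is.na(id)),
    any_missing = anyNA(id)
  )
na_summary
```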
To drop the NAs and multiply only the non-NA elements of a column, we specify na.rm=TRUE as an argument to prod(). Now we get the expected answer, the product of all non-NA values.
df %>%
  summarize(product = prod(id, na.rm=TRUE))
# A tibble: 1 × 1
  product
    <dbl>
1      60
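One caveat worth keeping in mind: when every value in the column is NA, na.rm=TRUE leaves nothing to multiply, and prod() returns 1 (the empty product) rather than NA. The sketch below uses a hypothetical all-NA column, df_all_na, to illustrate this and reports how many values were actually multiplied.

```r
library(tidyverse)

# Hypothetical column in which every value is missing.
df_all_na <- tibble(id = c(NA_integer_, NA_integer_, NA_integer_))

all_na_result <- df_all_na %>%
  summarize(
    product  = prod(id, na.rm = TRUE),  # empty product after dropping NAs
    n_non_na = sum(!is.na(id))          # number of values actually multiplied
  )
all_na_result
```

Reporting the count of non-NA values next to the product makes it easier to tell a genuine product of 1 apart from a column that had no usable data.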