In this tutorial, we will learn how to compute proportions with tidyverse. We will work through three examples. The first starts from the counts of a single column and turns them into proportions. The second computes the proportion of each combination of two categorical variables. The third computes proportions within groups, so that the proportions sum to 1 inside each group rather than across the whole dataset.
library(palmerpenguins)
library(tidyverse)
packageVersion("dplyr")
[1] '1.1.4'
We will use the Palmer Penguins dataset to compute proportions, after first dropping the rows that contain missing values.
penguins <- penguins |> drop_na()
First, we will compute the proportion of a single variable, species. Here are the counts of the three penguin species in the data.
penguins |>
  count(species)
# A tibble: 3 × 2
  species       n
  <fct>     <int>
1 Adelie      146
2 Chinstrap    68
3 Gentoo      119
To compute the proportions, we first count the number of rows for each species with count(), and then use mutate() to divide each count n by the total sum(n).
penguins |>
  count(species) |>
  mutate(prop = n / sum(n))
# A tibble: 3 × 3
  species       n  prop
  <fct>     <int> <dbl>
1 Adelie      146 0.438
2 Chinstrap    68 0.204
3 Gentoo      119 0.357
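As a quick sanity check on the step above, the sketch below recomputes the same proportions and compares them with base R's prop.table(). The names penguins_complete, species_prop, percent, and base_prop are illustrative and not part of the original tutorial; the percent column is just one possible way to present the result.

```r
library(dplyr)
library(tidyr)
library(palmerpenguins)

# Illustrative name: complete cases only, as in the tutorial
penguins_complete <- penguins |> drop_na()

# Count each species, then divide by the total count
species_prop <- penguins_complete |>
  count(species) |>
  mutate(
    prop = n / sum(n),
    percent = round(100 * prop, 1)  # same proportion, shown as a percentage
  )

# Cross-check with base R: table() counts, prop.table() divides by the total
base_prop <- prop.table(table(penguins_complete$species))

species_prop
base_prop
```

Because sum(n) is taken over all rows of the counted table, the prop column should add up to 1.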
To compute the proportion of each combination of two categorical variables, we first use count() on both variables to get the count for each combination, and then use mutate() as before. Because the result of count() is ungrouped, sum(n) is the total over all combinations, so each proportion is relative to the whole dataset.
penguins |>
  count(species, sex) |>
  mutate(prop = n / sum(n))
# A tibble: 6 × 4
  species   sex        n  prop
  <fct>     <fct>  <int> <dbl>
1 Adelie    female    73 0.219
2 Adelie    male      73 0.219
3 Chinstrap female    34 0.102
4 Chinstrap male      34 0.102
5 Gentoo    female    58 0.174
6 Gentoo    male      61 0.183
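One way to see what these proportions mean is to add them back up. The sketch below, using the illustrative names penguins_complete, joint, and marginal (none of which appear in the original), checks that the six proportions total 1 and that summing them per species recovers the single-variable species proportions.

```r
library(dplyr)
library(tidyr)
library(palmerpenguins)

# Illustrative name: complete cases only, as in the tutorial
penguins_complete <- penguins |> drop_na()

# Proportion of each species/sex combination, relative to all rows
joint <- penguins_complete |>
  count(species, sex) |>
  mutate(prop = n / sum(n))

# Summing the combination proportions within a species gives that
# species' share of the whole dataset
marginal <- joint |>
  group_by(species) |>
  summarise(prop = sum(prop))

sum(joint$prop)
marginal
```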
Compute proportion within groups
In this example, we show how to compute proportions within groups, i.e. the proportion of male and female penguins within each species. We start from the same counts of each species and sex combination.
penguins |>
  count(species, sex)
# A tibble: 6 × 3
  species   sex        n
  <fct>     <fct>  <int>
1 Adelie    female    73
2 Adelie    male      73
3 Chinstrap female    34
4 Chinstrap male      34
5 Gentoo    female    58
6 Gentoo    male      61
penguins |>
  count(species, sex) |>
  group_by(species) |>
  mutate(proportion = n / sum(n))
# A tibble: 6 × 4
# Groups:   species [3]
  species   sex        n proportion
  <fct>     <fct>  <int>      <dbl>
1 Adelie    female    73      0.5
2 Adelie    male      73      0.5
3 Chinstrap female    34      0.5
4 Chinstrap male      34      0.5
5 Gentoo    female    58      0.487
6 Gentoo    male      61      0.513
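The grouped result above stays grouped by species, which can affect later steps in a pipeline. As a possible alternative, dplyr 1.1.0 and later accept a .by argument in mutate() that groups for that one call only and returns an ungrouped tibble. The sketch below assumes such a dplyr version is installed; the names penguins_complete, within_species, and totals are illustrative.

```r
library(dplyr)
library(tidyr)
library(palmerpenguins)

# Illustrative name: complete cases only, as in the tutorial
penguins_complete <- penguins |> drop_na()

# Per-call grouping: sum(n) is computed separately for each species,
# and the result is not left grouped (requires dplyr >= 1.1.0)
within_species <- penguins_complete |>
  count(species, sex) |>
  mutate(proportion = n / sum(n), .by = species)

# Check: the proportions should add up to 1 inside every species
totals <- tapply(within_species$proportion, within_species$species, sum)

within_species
totals
```

With group_by(), the same ungrouped result can be obtained by adding ungroup() at the end of the pipeline.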