
Rstats 101

Learn R Programming Tips & Tricks for Statistics and Data Science


How to compute proportion with tidyverse

rstats101 · November 21, 2024

In this tutorial, we will learn how to compute proportions with tidyverse. We will see three examples of calculating proportions. In the first, we have counts of a single column and show how to calculate the proportion of each category. The second example shows how to compute the proportion of each combination of two categorical variables. The third shows how to compute proportions within groups, i.e. the proportion of each sex within each species.

library(palmerpenguins)
library(tidyverse)
packageVersion("dplyr")
[1] '1.1.4'

We will use the Palmer penguins dataset to compute proportions. We first drop the rows with missing values.

penguins <- 
  penguins |>
  drop_na()

First, we compute the proportion of a single variable, species. Here are the counts of the three penguin species in the data.

penguins |>
  count(species)

# A tibble: 3 × 2
  species       n
  <fct>     <int>
1 Adelie      146
2 Chinstrap    68
3 Gentoo      119

To compute the proportions, we first count the number of penguins in each species and then use the mutate() function to divide each count n by the total, sum(n).

penguins |>
  count(species) |>
  mutate(prop = n/sum(n))

# A tibble: 3 × 3
  species       n  prop
  <fct>     <int> <dbl>
1 Adelie      146 0.438
2 Chinstrap    68 0.204
3 Gentoo      119 0.357
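
As a quick sanity check, the proportions should add up to 1, and they can also be reported as percentages. The following is a minimal sketch, not part of the original example: it assumes only dplyr is loaded and re-enters the counts shown above by hand, and the names species_counts, species_props, and percent are made up for illustration.

```r
library(dplyr)

# Counts copied by hand from the count(species) output above (illustrative input)
species_counts <- tibble(
  species = c("Adelie", "Chinstrap", "Gentoo"),
  n = c(146L, 68L, 119L)
)

species_props <- species_counts |>
  mutate(
    prop = n / sum(n),               # share of all penguins
    percent = round(100 * prop, 1)   # the same share as a rounded percentage
  )

# The proportions of a complete set of categories sum to 1
sum(species_props$prop)
```

The percent column is only a presentation choice; keeping the unrounded prop column avoids compounding rounding error in later calculations.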

To compute the proportion of each combination of two categorical variables, we first use count() on both variables to get the count for each combination and then use mutate() as before. Here sum(n) is the total across all six combinations, so the proportions add up to 1 over the whole table.

penguins |>
  count(species, sex) |>
  mutate(prop = n/sum(n))

# A tibble: 6 × 4
  species   sex        n  prop
  <fct>     <fct>  <int> <dbl>
1 Adelie    female    73 0.219
2 Adelie    male      73 0.219
3 Chinstrap female    34 0.102
4 Chinstrap male      34 0.102
5 Gentoo    female    58 0.174
6 Gentoo    male      61 0.183
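
One way to see how these joint proportions relate to the single-variable ones is to add them up within each species, which should recover the species proportions from the first example. This is a sketch under the assumption that the counts above are re-entered by hand; joint_props and species_totals are illustrative names, not part of the original code.

```r
library(dplyr)

# Counts copied by hand from the count(species, sex) output above
joint_props <- tibble(
  species = rep(c("Adelie", "Chinstrap", "Gentoo"), each = 2),
  sex = rep(c("female", "male"), times = 3),
  n = c(73L, 73L, 34L, 34L, 58L, 61L)
) |>
  mutate(prop = n / sum(n))   # proportion of the whole table

# Summing the joint proportions over sex gives one proportion per species
species_totals <- joint_props |>
  group_by(species) |>
  summarise(prop = sum(prop))

species_totals
```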

Compute proportion within groups

In this example, we show how to compute proportions within groups, i.e. the proportion of male and female penguins within each species. We start with the counts for each combination of species and sex.

penguins |>
  count(species, sex)

# A tibble: 6 × 3
  species   sex        n
  <fct>     <fct>  <int>
1 Adelie    female    73
2 Adelie    male      73
3 Chinstrap female    34
4 Chinstrap male      34
5 Gentoo    female    58
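6 Gentoo    male      61

Grouping by species before calling mutate() makes sum(n) the total within each species, so each proportion is relative to its own species rather than to the whole dataset.

penguins |>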
  count(species, sex) |>
  group_by(species) |>
  mutate(proportion = n / sum(n))

# A tibble: 6 × 4
# Groups:   species [3]
  species   sex        n proportion
  <fct>     <fct>  <int>      <dbl>
1 Adelie    female    73      0.5  
2 Adelie    male      73      0.5  
3 Chinstrap female    34      0.5  
4 Chinstrap male      34      0.5  
5 Gentoo    female    58      0.487
6 Gentoo    male      61      0.513
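
Note that the result above is still grouped by species. As an alternative sketch, dplyr 1.1.0 and later (including the 1.1.4 used here) accept a .by argument in mutate(), which groups for that one step only and returns an ungrouped tibble. The example below assumes the counts above are re-entered by hand; sex_counts and within_species are illustrative names.

```r
library(dplyr)

# Counts copied by hand from the count(species, sex) output above
sex_counts <- tibble(
  species = rep(c("Adelie", "Chinstrap", "Gentoo"), each = 2),
  sex = rep(c("female", "male"), times = 3),
  n = c(73L, 73L, 34L, 34L, 58L, 61L)
)

# .by = species computes sum(n) separately for each species,
# without leaving the result grouped
within_species <- sex_counts |>
  mutate(proportion = n / sum(n), .by = species)

within_species
```

With the group_by() version, adding ungroup() at the end of the pipeline gives the same ungrouped result.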

Filed Under: rstats101 Tagged With: compute proportion with dplyr
