How to perform multiple t-tests using tidyverse

In this tutorial, we will learn how to perform multiple t-tests with the tidyverse framework to determine whether group means differ. Instead of looping over the groups with a for loop, we will use tidyverse packages and functions.

Let us load the packages needed.

library(tidyverse)
library(palmerpenguins)
library(broom)
theme_set(theme_bw(16))

We have two examples of using tidyverse to perform multiple t-tests starting from a dataframe. The first is a toy example with a small number of samples, and the second uses a larger sample size.

We will use the Palmer penguins dataset to perform the t-tests. First, let us subset the penguins data so that we have just 10 samples per species.

set.seed(42)
df <- penguins |>
  drop_na() |>
  group_by(species) |>
  slice_sample(n=10) |>
  ungroup()

Our sub-sampled dataset looks like this.

df |> head()

# A tibble: 6 × 8
  species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
  <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
1 Adelie  Biscoe              34.5          18.1               187        2900
2 Adelie  Torgersen           33.5          19                 190        3600
3 Adelie  Torgersen           42.1          19.1               195        4000
4 Adelie  Torgersen           41.5          18.3               195        4300
5 Adelie  Dream               41.5          18.5               201        4000
6 Adelie  Dream               37.5          18.5               199        4475
# ℹ 2 more variables: sex <fct>, year <int>
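Before running the tests, it can help to confirm that the sub-sample has the shape we expect. The sketch below repeats the sub-sampling step and counts rows per species and per species and sex. The names species_counts and sex_counts are illustrative and not part of the original tutorial.

```r
library(tidyverse)
library(palmerpenguins)

# Same sub-sampling step as above
set.seed(42)
df <- penguins |>
  drop_na() |>
  group_by(species) |>
  slice_sample(n = 10) |>
  ungroup()

# Rows per species: each species should contribute 10 rows
species_counts <- df |> count(species)
species_counts

# Rows per species and sex: each t-test needs both sexes to be present
sex_counts <- df |> count(species, sex)
sex_counts
```

Because the sampling is random within each species, the split between the sexes is not guaranteed to be even, which is worth keeping in mind when reading the results of such small tests.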

Multiple t-tests with tidyverse: Example 1

For the t-tests, we are mainly interested in two variables from the penguins data: bill length and sex. For each species, we want to perform a t-test to determine whether bill length differs between the sexes.

To perform multiple t-tests in the tidyverse framework, we will use group_by() to separate the data for each test. In this example, we group by the species variable, which gives us access to each penguin species’ data.

Then we create a list column, where each element is the result of applying a t-test on bill length and sex for one species.

df |>
  group_by(species) |>
  summarize(t_test_obj = list(t.test(bill_length_mm ~ sex))) 

# A tibble: 3 × 2
  species   t_test_obj
  <fct>     <list>    
1 Adelie    <htest>   
2 Chinstrap <htest>   
3 Gentoo    <htest>
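Each element of the list column is an ordinary htest object, so it can be pulled out and inspected on its own. A minimal sketch, assuming the same sub-sampled data as above; the names t_tests and adelie_test are illustrative.

```r
library(tidyverse)
library(palmerpenguins)

# Same sub-sampling step as above
set.seed(42)
df <- penguins |>
  drop_na() |>
  group_by(species) |>
  slice_sample(n = 10) |>
  ungroup()

t_tests <- df |>
  group_by(species) |>
  summarize(t_test_obj = list(t.test(bill_length_mm ~ sex)))

# The first element of the list column belongs to the first species (Adelie)
adelie_test <- t_tests$t_test_obj[[1]]

class(adelie_test)     # "htest"
adelie_test$p.value    # p-value of the Adelie test
adelie_test$conf.int   # confidence interval for the difference in means
```

Working with the raw objects this way gets tedious with more than a few groups, which is why converting them to a dataframe is more convenient.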

We can convert each t-test result object into a tidy dataframe using broom’s tidy() function.

df |>
  group_by(species) |>
  summarize(t_test_obj = list(t.test(bill_length_mm ~ sex))) |>
  mutate(ttest_res = map(t_test_obj, tidy)) 

# A tibble: 3 × 3
  species   t_test_obj ttest_res        
  <fct>     <list>     <list>           
1 Adelie    <htest>    <tibble [1 × 10]>
2 Chinstrap <htest>    <tibble [1 × 10]>
3 Gentoo    <htest>    <tibble [1 × 10]>

Then we unnest the result from broom to get the t-test results as a dataframe, with one row of results for each species.

df |>
  group_by(species) |>
  summarize(t_test_obj = list(t.test(bill_length_mm ~ sex))) |>
  mutate(ttest_res = map(t_test_obj, tidy)) |>
  unnest(ttest_res)

# A tibble: 3 × 12
  species   t_test_obj estimate estimate1 estimate2 statistic p.value parameter
  <fct>     <list>        <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>
1 Adelie    <htest>       -4.1       36.4      40.5     -2.58  0.0381      6.70
2 Chinstrap <htest>       -3.14      47.2      50.4     -1.96  0.0892      7.25
3 Gentoo    <htest>       -3.36      46.1      49.4     -2.32  0.0524      7.20
# ℹ 4 more variables: conf.low <dbl>, conf.high <dbl>, method <chr>,
#   alternative <chr>
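The unnested result is wide and still carries the list column, so part of it is hidden when printed. One possible way to make it easier to read is to keep only the columns of interest and add a flag for significance. This is a sketch: the choice of columns, the 0.05 threshold, and the names ttest_summary and significant are assumptions made for illustration.

```r
library(tidyverse)
library(palmerpenguins)
library(broom)

# Same sub-sampling step as above
set.seed(42)
df <- penguins |>
  drop_na() |>
  group_by(species) |>
  slice_sample(n = 10) |>
  ungroup()

ttest_summary <- df |>
  group_by(species) |>
  summarize(t_test_obj = list(t.test(bill_length_mm ~ sex))) |>
  mutate(ttest_res = map(t_test_obj, tidy)) |>
  unnest(ttest_res) |>
  # Keep the key statistics and drop the htest list column
  select(species, estimate, statistic, p.value, conf.low, conf.high) |>
  # Flag tests below the (assumed) 0.05 significance level
  mutate(significant = p.value < 0.05)

ttest_summary
```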

By checking the p.value column, we can see that only one of the t-tests (Adelie, p = 0.0381) is statistically significant at the 0.05 level. We can also see this by visualizing the underlying data as boxplots with ggplot2.

df |>
  ggplot(aes(x=sex, y=bill_length_mm, fill=sex))+
  geom_boxplot(outlier.shape = NA)+
  geom_jitter(width=0.1)+
  facet_wrap(~species)+
  theme(legend.position = "none")+
  scale_fill_brewer(palette="Dark2")+
  scale_y_continuous(breaks=scales::breaks_pretty(6))+
  labs(title="How to perform multiple t-test with tidyverse framework")
ggsave("How_to_perform_multiple_t_tests_with_tidyverse.png", width=8, height=6)


How to do multiple t-tests with tidyverse: Example 2

In the second example of performing multiple t-tests, we use the full penguins dataset instead of just 10 samples per species.

We use the same approach described above to perform multiple t-tests with the tidyverse framework and take a look at the p-value from each t-test. We can see that the p-value for each test is far below 0.05, indicating a statistically significant difference in mean bill length between the sexes in all three penguin species.

penguins |>
  drop_na() |>
  group_by(species) |>
  summarize(t_test_obj = list(t.test(bill_length_mm ~ sex))) |>
  mutate(ttest_res = map(t_test_obj, tidy)) |>
  unnest(ttest_res)

# A tibble: 3 × 12
  species   t_test_obj estimate estimate1 estimate2 statistic  p.value parameter
  <fct>     <list>        <dbl>     <dbl>     <dbl>     <dbl>    <dbl>     <dbl>
1 Adelie    <htest>       -3.13      37.3      40.4     -8.78 4.80e-15     142. 
2 Chinstrap <htest>       -4.52      46.6      51.1     -7.57 8.92e-10      48.7
3 Gentoo    <htest>       -3.91      45.6      49.5     -8.88 1.32e-14     111. 
# ℹ 4 more variables: conf.low <dbl>, conf.high <dbl>, method <chr>,
#   alternative <chr>
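For comparison, here is roughly what the explicit for loop that the tidyverse pipeline replaces could look like. It is only a sketch: the names penguins_complete, loop_results, and loop_pvalues are illustrative, and there are other ways to write such a loop.

```r
library(tidyverse)
library(palmerpenguins)
library(broom)

penguins_complete <- penguins |> drop_na()

# Loop version: run one t-test per species and collect the tidy results
loop_results <- list()
for (sp in levels(penguins_complete$species)) {
  species_data <- filter(penguins_complete, species == sp)
  loop_results[[sp]] <- tidy(t.test(bill_length_mm ~ sex, data = species_data))
}

# Stack the per-species results into one dataframe
loop_pvalues <- bind_rows(loop_results, .id = "species") |>
  select(species, estimate, statistic, p.value)

loop_pvalues
```

The loop needs manual bookkeeping to subset the data and collect the results, while the group_by() and summarize() version keeps the grouping variable and the results together in a single dataframe.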

Visualizing the full data using boxplots suggests the same conclusion.

penguins |>
  drop_na() |>
  ggplot(aes(x=sex, y=bill_length_mm, fill=sex))+
  geom_boxplot(outlier.shape = NA)+
  geom_jitter(width=0.1)+
  facet_wrap(~species)+
  theme(legend.position = "none")+
  scale_fill_brewer(palette="Dark2")+
  scale_y_continuous(breaks=scales::breaks_pretty(6))+
  labs(title="How to perform multiple t-test with tidyverse framework")