How to perform a t-test in R

In this post, we will learn how to perform a t-test in R and understand when and why to use it. A t-test is one of the most commonly used statistical tests. It is used to compare measurements from two groups and determine whether there is a significant difference between their mean values.

We will start with an example using simulated data in which there is a clear difference in the mean values of the two groups. Then we will use simulated data for two groups with no difference in mean.

Let us load the packages needed.

library(tidyverse)
theme_set(theme_bw(16))

Applying t.test() in R: Example 1

Here we simulate two variables x and y from random normal distributions, corresponding to two groups of interest.

x <- rnorm(n=15,mean=10, sd = 1)
y <- rnorm(n=15,mean=15, sd = 1)
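
Because rnorm() draws new random numbers on every call, the values you get will differ from the ones printed in this post. A minimal sketch, assuming you want repeatable draws, is to call set.seed() before simulating. The seed value 42 and the names x1 and x2 are arbitrary choices for illustration; they are not what was used to produce the outputs shown here.

```r
# Fix the random number generator state so the draws are repeatable.
# The seed 42 is arbitrary and will NOT reproduce the numbers in this post.
set.seed(42)
x1 <- rnorm(n = 15, mean = 10, sd = 1)

# Resetting the same seed and repeating the call gives the same draws
set.seed(42)
x2 <- rnorm(n = 15, mean = 10, sd = 1)

identical(x1, x2)
```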

The variable x has a mean of about 10.

mean(x)
[1] 10.15238

The variable y has a mean of about 15.

mean(y)

[1] 14.75341

One of the most common ways to use the t.test() function is to provide the values of the two groups as arguments. Here we pass the x and y vectors to the t.test() function available in R to determine whether the means of these two groups are different.

t_test_res <- t.test(x, y)

Printing the resulting object shows a quick summary of the t-test results.

t_test_res

    Welch Two Sample t-test

data:  x and y
t = -12.9, df = 26.34, p-value = 6.816e-13
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -5.333730 -3.868317
sample estimates:
mean of x mean of y 
 10.15238  14.75341 
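
The "Welch Two Sample t-test" header and the fractional degrees of freedom (26.34) in the output indicate that t.test() does not assume equal variances by default. As a rough sketch of what is being computed, the statistic, the Welch-Satterthwaite degrees of freedom, and the two-sided p-value can be recomputed by hand and compared with the function's result. The data are simulated afresh here, so the numbers will not match the output above, and the names vx, vy, t_manual, df_manual, p_manual, and res are illustrative.

```r
# Simulated data, same setup as Example 1 (values differ from run to run)
x <- rnorm(n = 15, mean = 10, sd = 1)
y <- rnorm(n = 15, mean = 15, sd = 1)

# Squared standard error of each group mean
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch t statistic: difference in means over its standard error
t_manual <- (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (vx + vy)^2 /
  (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# Compare with t.test(); unname() drops the "t" and "df" labels
res <- t.test(x, y)
all.equal(t_manual, unname(res$statistic))
all.equal(df_manual, unname(res$parameter))
all.equal(p_manual, res$p.value)
```

Passing var.equal = TRUE to t.test() requests the classic pooled-variance (Student's) t-test instead.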

We can access individual results using the $ notation. For example, we can get the p-value from the t-test.

t_test_res$p.value

[1] 6.816459e-13

The low p-value shows that the mean difference is statistically significant. We can use the broom package's tidy() function to get all the results in a data frame, as shown below.

t_test_res |> broom::tidy()

# A tibble: 1 × 10
  estimate estimate1 estimate2 statistic  p.value parameter conf.low conf.high
     <dbl>     <dbl>     <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl>
1    -4.60      10.2      14.8     -12.9 6.82e-13      26.3    -5.33     -3.87
# ℹ 2 more variables: method <chr>, alternative <chr>
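
If you would rather not depend on broom, the same pieces can be pulled from the object returned by t.test() with the $ notation. The sketch below simulates fresh data, so its values will not match the output above; res is an illustrative name.

```r
# Simulated data, same setup as Example 1 (values differ from run to run)
x <- rnorm(n = 15, mean = 10, sd = 1)
y <- rnorm(n = 15, mean = 15, sd = 1)

res <- t.test(x, y)

names(res)        # all components stored in the result
res$statistic     # t statistic
res$parameter     # degrees of freedom
res$conf.int      # confidence interval for the difference in means
res$estimate      # the two group means
```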

An important thing to remember while applying any statistical test is to visualize the data and check whether the results of the test match what the data show. Here we use a boxplot to visualize the distributions of the two groups. We can clearly see that the two groups are distinct, with different mean/median values.

tibble(group=c(rep("g1",15), rep("g2",15)), data =c(x,y) ) |>
  ggplot(aes(x=group, y=data,fill=group )) +
  geom_boxplot(outlier.shape = NA)+
  geom_jitter(width=0.1)+
  theme(legend.position = "none")
Applying t-test on two groups with clear difference in mean values
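
The plot above uses the data in long format, with one row per observation and a column naming the group. When data is already in that shape, t.test() also accepts a formula, so there is no need to split it back into two vectors. A minimal sketch, using a base R data frame with the illustrative names df_long, group, and value:

```r
# Simulated data, same setup as Example 1 (values differ from run to run)
x <- rnorm(n = 15, mean = 10, sd = 1)
y <- rnorm(n = 15, mean = 15, sd = 1)

# Long format: one row per observation, with a group label
df_long <- data.frame(
  group = rep(c("g1", "g2"), each = 15),
  value = c(x, y)
)

# Formula interface: compare value between the two levels of group
res_formula <- t.test(value ~ group, data = df_long)

# Same test as passing the two vectors directly
res_vectors <- t.test(x, y)
all.equal(res_formula$p.value, res_vectors$p.value)
```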

Applying t.test(): Example 2

In the previous example, we simulated two groups with different means, and the t-test correctly determined that the difference in the mean values of the groups is statistically significant.

Let us now simulate data where there is no difference in the mean values of the two groups (both are drawn with a mean of 10, though with different standard deviations) and check the results of the t-test. We can see that the p-value from the t-test is 0.776, much closer to 1 than to 0, showing that the mean difference is not statistically significant.

x <- rnorm(n=15,mean=10, sd = 1)
y <- rnorm(n=15,mean=10, sd = 2)

t_test_res <- t.test(x,y)
t_test_res |> broom::tidy()

# A tibble: 1 × 10
  estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
     <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl>
1   -0.131      10.2      10.3    -0.287   0.776      23.5    -1.07     0.812
# ℹ 2 more variables: method <chr>, alternative <chr>

We can also verify this by visualizing the data as a boxplot and seeing that the two distributions clearly overlap, with no clear difference in mean values.

tibble(group=c(rep("g1",15), rep("g2",15)), data =c(x,y) ) |>
  ggplot(aes(x=group, y=data,fill=group )) +
  geom_boxplot(outlier.shape = NA)+
  geom_jitter(width=0.1)+
  theme(legend.position = "none")
Applying t-test on two groups with NO clear difference in mean values
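
A single p-value such as 0.776 is only one random outcome. To get a feel for how the test behaves when there is truly no difference in means, one option is to repeat the Example 2 simulation many times and look at how often the p-value falls below 0.05. This is a rough sketch: the seed, the 2000 repetitions, and the names p_values and prop_significant are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, used only to make this sketch repeatable

# Repeat the Example 2 setup many times and keep each p-value
p_values <- replicate(2000, {
  x <- rnorm(n = 15, mean = 10, sd = 1)
  y <- rnorm(n = 15, mean = 10, sd = 2)
  t.test(x, y)$p.value
})

# Share of runs that would be called significant at the 0.05 level
prop_significant <- mean(p_values < 0.05)
prop_significant
```

With no real difference between the groups, this share should be close to 0.05, which is the false positive rate we accept by using that threshold.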
