
Rstats 101

Learn R Programming Tips & Tricks for Statistics and Data Science


How to Compute row means

rstats101 · February 24, 2023 ·

In this tutorial, we will learn how to compute row means in the tidyverse using the dplyr package. We will start with several examples that compute row means with rowMeans() and with dplyr’s row-wise operations on a dataframe with no missing values. Then we will see two examples that use rowMeans() and row-wise operations on a dataframe with missing values.

First, we load the tidyverse meta-package and check the version of dplyr used here.

library(tidyverse)
# check package version
packageVersion("dplyr")

## [1] '1.1.0'

Creating data for computing row means with dplyr

First, let us create a toy dataframe with no missing values using the sample() function. We create a random data vector, reshape it into a matrix, and convert the matrix to a dataframe using the as_tibble() function from the tidyverse.

set.seed(2023)
# create random data
data <- sample(c(1:6), 20, replace = TRUE)
# create a matrix
data_mat <- matrix(data, ncol = 4)
# convert the matrix to a dataframe
data_df <- as_tibble(data_mat)

Our dataframe looks like this.

data_df %>% head()
## # A tibble: 5 × 4
##      V1    V2    V3    V4
##   <int> <int> <int> <int>
## 1     5     2     1     5
## 2     1     1     5     4
## 3     3     1     5     5
## 4     2     5     2     1
## 5     4     1     3     1
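
The column names V1 to V4 above are generated automatically because the matrix has no column names, and some versions of tibble emit a name-repair warning in that case. As a minimal sketch, assuming you would rather set the names yourself, you can assign them to the matrix before converting it. The names built with paste0() here are simply chosen to match the ones used in this tutorial.

```r
library(dplyr)

set.seed(2023)
# same construction as above: 20 random integers reshaped into 4 columns
data_mat <- matrix(sample(1:6, 20, replace = TRUE), ncol = 4)
# name the columns explicitly so as_tibble() does not have to invent names
colnames(data_mat) <- paste0("V", 1:4)
data_df <- as_tibble(data_mat)
names(data_df)
```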

Row means with dplyr using rowMeans() and across() with tidy selection

We compute the mean of each row using base R’s rowMeans() function in combination with across(), which hands the selected columns to rowMeans() as a dataframe. We select the columns of interest using the tidy select function starts_with().

data_df %>%
  mutate(rmean = rowMeans(across(starts_with("V"))))

## # A tibble: 5 × 5
##      V1    V2    V3    V4 rmean
##   <int> <int> <int> <int> <dbl>
## 1     5     2     1     5  3.25
## 2     1     1     5     4  2.75
## 3     3     1     5     5  3.5 
## 4     2     5     2     1  2.5 
## 5     4     1     3     1  2.25
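
To make the arithmetic easy to check by hand, here is a small sketch on a hypothetical three-row tibble (toy_df and its values are made up for illustration, not taken from the random data above). It also shows that the same pattern should work for a subset of columns, and that plain rowMeans() on the selected columns gives the same numbers.

```r
library(dplyr)

# hypothetical toy data with hand-picked values
toy_df <- tibble(
  V1 = c(5, 1, 3),
  V2 = c(2, 1, 1),
  V3 = c(1, 5, 5),
  V4 = c(5, 4, 5)
)

# row means over all four columns
res_all <- toy_df %>%
  mutate(rmean = rowMeans(across(starts_with("V"))))

# the same computation in base R on the selected columns
base_means <- rowMeans(toy_df[, c("V1", "V2", "V3", "V4")])

# row means over a subset of columns only (V1 and V3)
res_sub <- toy_df %>%
  mutate(rmean_13 = rowMeans(across(c(V1, V3))))

res_all$rmean     # 3.25 2.75 3.50
res_sub$rmean_13  # 3 3 4
```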

Row means with dplyr using rowMeans() and pick() with tidy selection

In this example, we compute the mean of each row using base R’s rowMeans() function in combination with pick(), a function introduced in dplyr 1.1.0. pick() selects the columns of interest with tidy selection, here starts_with(), and returns them as a dataframe that rowMeans() can work on.

data_df %>%
  mutate(rmean = rowMeans(pick(starts_with("V"))))

## # A tibble: 5 × 5
##      V1    V2    V3    V4 rmean
##   <int> <int> <int> <int> <dbl>
## 1     5     2     1     5  3.25
## 2     1     1     5     4  2.75
## 3     3     1     5     5  3.5 
## 4     2     5     2     1  2.5 
## 5     4     1     3     1  2.25
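
As a quick check that the two approaches agree, the sketch below computes the row means with both pick() and across() on a hypothetical two-row tibble (toy_df and the sample_id column are made up for illustration). It assumes dplyr 1.1.0 or later, since pick() is not available in earlier versions.

```r
library(dplyr)

# hypothetical toy data with one non-numeric column
toy_df <- tibble(
  sample_id = c("a", "b"),
  V1 = c(2, 4),
  V2 = c(4, 8)
)

res <- toy_df %>%
  mutate(
    rmean_pick   = rowMeans(pick(starts_with("V"))),
    rmean_across = rowMeans(across(starts_with("V")))
  )

# both columns hold the same values: 3 and 6
identical(res$rmean_pick, res$rmean_across)
```

Because starts_with("V") only matches V1 and V2, the character column sample_id is left out of the computation.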

Row means using the rowwise() function in dplyr

Another way to compute row means with dplyr is to use row-wise operations. dplyr provides the rowwise() function for this, which makes subsequent verbs operate on one row at a time.

data_df %>%
  rowwise()

## # A tibble: 5 × 4
## # Rowwise: 
##      V1    V2    V3    V4
##   <int> <int> <int> <int>
## 1     5     2     1     5
## 2     1     1     5     4
## 3     3     1     5     5
## 4     2     5     2     1
## 5     4     1     3     1

First, we apply the rowwise() function and then use mutate() to compute the mean, with c_across() and a tidy select function collecting the values of each row. In this example, we use starts_with() to select the columns of interest.

data_df %>%
  rowwise() %>%
  mutate(rmean = mean(c_across(starts_with("V"))))

## # A tibble: 5 × 5
## # Rowwise: 
##      V1    V2    V3    V4 rmean
##   <int> <int> <int> <int> <dbl>
## 1     5     2     1     5  3.25
## 2     1     1     5     4  2.75
## 3     3     1     5     5  3.5 
## 4     2     5     2     1  2.5 
## 5     4     1     3     1  2.25
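
Note that the result above is still marked as "Rowwise", so later verbs would keep operating one row at a time. A minimal sketch, assuming you want to return to ordinary column-wise behaviour afterwards, is to finish the pipeline with ungroup(). The tibble toy_df below is hypothetical and only serves to keep the example self-contained.

```r
library(dplyr)

# hypothetical toy data with hand-picked values
toy_df <- tibble(
  V1 = c(5, 1),
  V2 = c(2, 1),
  V3 = c(1, 5),
  V4 = c(5, 4)
)

res <- toy_df %>%
  rowwise() %>%
  mutate(rmean = mean(c_across(starts_with("V")))) %>%
  ungroup()  # drop the row-wise grouping

# a later summary now runs over the whole rmean column
overall <- res %>%
  summarise(grand_mean = mean(rmean))
```

Without the ungroup() step, the summarise() call would be evaluated once per row instead of once for the whole dataframe.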

Row means using the rowwise() function in dplyr: Example 2

In the example below, we again use the rowwise() function in dplyr, but select the columns of interest as a range from a start column name to an end column name.

data_df %>%
  rowwise() %>%
  mutate(rmean = mean(c_across(V1:V4)))

## # A tibble: 5 × 5
## # Rowwise: 
##      V1    V2    V3    V4 rmean
##   <int> <int> <int> <int> <dbl>
## 1     5     2     1     5  3.25
## 2     1     1     5     4  2.75
## 3     3     1     5     5  3.5 
## 4     2     5     2     1  2.5 
## 5     4     1     3     1  2.25

Row means using the rowwise() function in dplyr: Example 3

We can also use other tidy select functions to select the columns of interest. Here is an example where we use all numeric columns to compute the row-wise mean.

data_df %>%
  rowwise() %>%
  mutate(rmean = mean(c_across(where(is.numeric))))

## # A tibble: 5 × 5
## # Rowwise: 
##      V1    V2    V3    V4 rmean
##   <int> <int> <int> <int> <dbl>
## 1     5     2     1     5  3.25
## 2     1     1     5     4  2.75
## 3     3     1     5     5  3.5 
## 4     2     5     2     1  2.5 
## 5     4     1     3     1  2.25
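
The toy dataframe above contains only numeric columns, so where(is.numeric) selects everything. The sketch below uses a hypothetical tibble with an extra character column (sample_id is a made-up name) to illustrate that non-numeric columns are skipped.

```r
library(dplyr)

# hypothetical toy data: one character column and three numeric columns
toy_df <- tibble(
  sample_id = c("s1", "s2"),
  V1 = c(1, 4),
  V2 = c(2, 5),
  V3 = c(3, 6)
)

res <- toy_df %>%
  rowwise() %>%
  mutate(rmean = mean(c_across(where(is.numeric)))) %>%
  ungroup()

res$rmean  # 2 5
```

One caveat with this selection is that any numeric identifier column, such as a numeric ID, would also be included in the mean, so a name-based selection may be safer in that situation.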

Row means with dplyr on a dataframe with NAs

When rows contain NAs, i.e. missing values, both rowMeans() and mean() return NA for any row with at least one missing value, because by default they do not remove NAs before computing the mean.

Here is an example showing the default behaviour of computing row means.

# create random data that includes missing values
data <- sample(c(1:5, NA), 40, replace = TRUE)
# create a matrix
data_mat <- matrix(data, ncol = 4)
# convert the matrix to a dataframe
data_df <- as_tibble(data_mat)
data_df %>% head()
## # A tibble: 6 × 4
##      V1    V2    V3    V4
##   <int> <int> <int> <int>
## 1    NA     5     3     5
## 2     2     4     4     2
## 3    NA    NA    NA     3
## 4    NA     1     1     1
## 5     5    NA     5     4
## 6     1     4    NA     2

data_df %>%
  mutate(rmean = rowMeans(across(starts_with("V"))))

## # A tibble: 10 × 5
##       V1    V2    V3    V4 rmean
##    <int> <int> <int> <int> <dbl>
##  1    NA     5     3     5    NA
##  2     2     4     4     2     3
##  3    NA    NA    NA     3    NA
##  4    NA     1     1     1    NA
##  5     5    NA     5     4    NA
##  6     1     4    NA     2    NA
##  7     2    NA     2     5    NA
##  8    NA    NA     4     2    NA
##  9    NA     2     4     4    NA
## 10     1     2     1     4     2

With na.rm = TRUE, missing values are dropped and each row mean is computed from the remaining values, which is usually what we intend. In the example below, we pass the na.rm argument to the rowMeans() function.

data_df %>%
  mutate(rmean = rowMeans(across(starts_with("V")), na.rm=TRUE))

## # A tibble: 10 × 5
##       V1    V2    V3    V4 rmean
##    <int> <int> <int> <int> <dbl>
##  1    NA     5     3     5  4.33
##  2     2     4     4     2  3   
##  3    NA    NA    NA     3  3   
##  4    NA     1     1     1  1   
##  5     5    NA     5     4  4.67
##  6     1     4    NA     2  2.33
##  7     2    NA     2     5  3   
##  8    NA    NA     4     2  3   
##  9    NA     2     4     4  3.33
## 10     1     2     1     4  2
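
With na.rm = TRUE, each mean is taken over however many values are present in that row, so a row mean may rest on a single observation (as in row 3 above). As a minimal sketch, assuming you want to keep track of this, the code below adds a count of non-missing values next to the mean. The tibble toy_df reuses the values of the first three rows shown above, and n_obs is a made-up column name.

```r
library(dplyr)

# hypothetical toy data mirroring rows 1 to 3 of the output above
toy_df <- tibble(
  V1 = c(NA, 2, NA),
  V2 = c(5, 4, NA),
  V3 = c(3, 4, NA),
  V4 = c(5, 2, 3)
)

res <- toy_df %>%
  mutate(
    # mean of the non-missing values in each row
    rmean = rowMeans(across(V1:V4), na.rm = TRUE),
    # number of non-missing values the mean is based on
    n_obs = rowSums(!is.na(across(V1:V4)))
  )

res$n_obs  # 3 4 1
```

The first row mean is (5 + 3 + 5) / 3, which rounds to the 4.33 shown in the output above.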

In the example below, we pass na.rm = TRUE to the mean() function in a row-wise operation using the rowwise() function.

data_df %>%
  rowwise() %>%
  mutate(rmean = mean(c_across(where(is.numeric)), na.rm=TRUE))
## # A tibble: 10 × 5
## # Rowwise: 
##       V1    V2    V3    V4 rmean
##    <int> <int> <int> <int> <dbl>
##  1    NA     5     3     5  4.33
##  2     2     4     4     2  3   
##  3    NA    NA    NA     3  3   
##  4    NA     1     1     1  1   
##  5     5    NA     5     4  4.67
##  6     1     4    NA     2  2.33
##  7     2    NA     2     5  3   
##  8    NA    NA     4     2  3   
##  9    NA     2     4     4  3.33
## 10     1     2     1     4  2
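
One edge case that does not occur in the data above is a row in which every value is missing. In that case both rowMeans() and mean() return NaN when na.rm = TRUE, because there is nothing left to average. The sketch below, on a hypothetical tibble, shows this and one possible way to turn the NaN back into NA. The column name rmean_clean is made up for illustration.

```r
library(dplyr)

# hypothetical toy data: the second row has no observed values
toy_df <- tibble(
  V1 = c(1, NA),
  V2 = c(3, NA)
)

res <- toy_df %>%
  mutate(
    rmean = rowMeans(across(V1:V2), na.rm = TRUE),
    # replace NaN (mean of zero values) with NA
    rmean_clean = if_else(is.nan(rmean), NA_real_, rmean)
  )

res$rmean        # 2 NaN
res$rmean_clean  # 2 NA
```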
