How to remove rows with all NAs


In this tutorial, we will learn how to remove rows in which all values are NA using dplyr in tidyverse. For example, in the cartoon illustration below we have a data frame with three rows, and two of the rows have NAs for all elements. We will learn how to filter out the rows with all NA values using three related tidyverse approaches.

Remove rows with all NAs

We use dplyr version 1.1.2.

library(tidyverse)
packageVersion("dplyr")
[1] '1.1.2'

First, we create a data frame in which several rows contain NAs.

set.seed(2022)
x  <- c(4:9,rep(NA,10)) 
df <- tibble(C1= sample(x,5),
             C2= sample(x,5),
             C3= sample(x,5))

Our toy data frame has at least one NA value in every row, but only two of the five rows have all NA values.

df

# A tibble: 5 × 3
     C1    C2    C3
  <int> <int> <int>
1     7     7    NA
2     6     9    NA
3    NA    NA    NA
4    NA    NA    NA
5    NA    NA     5

The naive approach of simply calling na.omit() will not work. na.omit() removes a row if it contains even a single NA value, so for our toy data frame it removes every row.

df %>% na.omit()

# A tibble: 0 × 3
# ℹ 3 variables: C1 <int>, C2 <int>, C3 <int>
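To see why na.omit() leaves nothing behind, it can help to count the NA values in each row. The snippet below is a minimal sketch: it retypes the toy data frame by hand from the printed output (rather than relying on the random seed) and uses base R's rowSums() and complete.cases(); the name na_per_row is only illustrative.

```r
library(dplyr)

# The toy data frame, typed in by hand from the printed output above.
df <- tibble(C1 = c(7L, 6L, NA, NA, NA),
             C2 = c(7L, 9L, NA, NA, NA),
             C3 = c(NA, NA, NA, NA, 5L))

# Number of NA values in each row.
na_per_row <- rowSums(is.na(df))
na_per_row
# 1 1 3 3 2

# na.omit() keeps only complete rows, and there are none here.
sum(complete.cases(df))
# 0

# Rows where every column is NA are the ones we actually want to drop.
which(na_per_row == ncol(df))
# 3 4
```

Every row has at least one NA, so no row is complete, but only rows 3 and 4 are missing in all three columns.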

Remove rows with all NAs using if_any()

One way to remove rows with all NA values is to combine the filter() function with if_any(). The predicate ~!is.na(.) is TRUE for a non-missing value, so if_any() keeps every row that has at least one non-NA value.

df %>%
  filter(if_any(everything(), ~!is.na(.)))

# A tibble: 3 × 3
     C1    C2    C3
  <int> <int> <int>
1     7     7    NA
2     6     9    NA
3    NA    NA     5
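The same condition can be phrased the other way around: instead of keeping rows with at least one non-NA value, drop rows where all values are NA. The sketch below assumes dplyr's if_all(), the companion of if_any(), and rebuilds the toy data frame by hand; keep_any and drop_all are illustrative names.

```r
library(dplyr)

# The toy data frame, typed in by hand from the printed output above.
df <- tibble(C1 = c(7L, 6L, NA, NA, NA),
             C2 = c(7L, 9L, NA, NA, NA),
             C3 = c(NA, NA, NA, NA, 5L))

# Keep a row if at least one value is not NA ...
keep_any <- df %>%
  filter(if_any(everything(), ~ !is.na(.x)))

# ... which is the same as dropping a row when all values are NA.
drop_all <- df %>%
  filter(!if_all(everything(), is.na))

nrow(keep_any)
# 3
nrow(drop_all)
# 3
```

Which form to use is mostly a matter of taste; the negated if_all() version reads closer to the phrase "remove rows with all NAs".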

Remove rows with all NAs using across()

Another option is to use dplyr's rowwise() function and check for NAs in each row. In the example below, across(everything()) returns all the columns of the current row, and we keep the row unless every value in it is NA.

df %>%
  rowwise() %>%
  filter(!all(is.na(across(everything())))) %>%
  ungroup()

# A tibble: 3 × 3
     C1    C2    C3
  <int> <int> <int>
1     7     7    NA
2     6     9    NA
3    NA    NA     5
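rowwise() evaluates the condition one row at a time, which can be slow on large data frames. A possible vectorized variant, sketched below, counts the non-NA values per row with base R's rowSums(). It assumes pick() is available (it was added in dplyr 1.1.0); the name kept is only illustrative.

```r
library(dplyr)

# The toy data frame, typed in by hand from the printed output above.
df <- tibble(C1 = c(7L, 6L, NA, NA, NA),
             C2 = c(7L, 9L, NA, NA, NA),
             C3 = c(NA, NA, NA, NA, 5L))

# Count the non-NA values in each row in one vectorized step
# and keep the rows where that count is positive.
kept <- df %>%
  filter(rowSums(!is.na(pick(everything()))) > 0)

nrow(kept)
# 3
```

No ungroup() is needed here because the data frame is never made rowwise.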

Remove rows with all NAs using c_across()

We can also combine rowwise() with dplyr's c_across() function. The example below checks whether the numeric columns in each row are all NA. Because we do not call ungroup() at the end, the result is still a rowwise tibble.

df %>%
  rowwise() %>%
  filter(!all(is.na(c_across(where(is.numeric)))))

# A tibble: 3 × 3
# Rowwise: 
     C1    C2    C3
  <int> <int> <int>
1     7     7    NA
2     6     9    NA
3    NA    NA     5
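Restricting the check to where(is.numeric) matters once the data frame has non-numeric columns. The sketch below uses a hypothetical data frame, df2, with a character id column that is not part of the toy data above. A row whose numeric columns are all NA is dropped even though its id is present, whereas an if_any() check over every column keeps it.

```r
library(dplyr)

# Hypothetical data: a character id column plus two numeric columns.
df2 <- tibble(id = c("a", "b", "c"),
              C1 = c(1L, NA, NA),
              C2 = c(NA, NA, 3L))

# Only the numeric columns are inspected, so row "b" is dropped
# even though its id is not NA.
numeric_only <- df2 %>%
  rowwise() %>%
  filter(!all(is.na(c_across(where(is.numeric))))) %>%
  ungroup()

numeric_only$id
# "a" "c"

# Checking every column keeps row "b", because its id is present.
all_columns <- df2 %>%
  filter(if_any(everything(), ~ !is.na(.x)))

all_columns$id
# "a" "b" "c"
```

The ungroup() call at the end returns an ordinary tibble instead of a rowwise one.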