
We use dplyr version 1.1.2.
library(tidyverse)
packageVersion("dplyr")
[1] '1.1.2'
First, we create a data frame in which several rows contain NAs.
set.seed(2022)
x <- c(4:9, rep(NA, 10))
df <- tibble(C1 = sample(x, 5),
             C2 = sample(x, 5),
             C3 = sample(x, 5))
Our toy data frame has at least one NA value in every row, but only two of the five rows have all NA values.
df
# A tibble: 5 × 3
     C1    C2    C3
  <int> <int> <int>
1     7     7    NA
2     6     9    NA
3    NA    NA    NA
4    NA    NA    NA
5    NA    NA     5
The naive approach of simply calling na.omit() will not work. For our toy data frame, na.omit() removes every row, because it drops any row that contains even one NA value.
df %>%
  na.omit()
# A tibble: 0 × 3
# ℹ 3 variables: C1 <int>, C2 <int>, C3 <int>
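To see why every row is dropped, it can help to look at complete.cases(), which flags the rows that have no missing values. The sketch below assumes the toy data frame is re-typed by hand from the printed output above, so it does not depend on the random seed; the name cc is only illustrative.

```r
library(dplyr)

# Toy data re-typed from the printed tibble (assumption: values match the seeded sample)
df <- tibble(
  C1 = c(7L, 6L, NA, NA, NA),
  C2 = c(7L, 9L, NA, NA, NA),
  C3 = c(NA, NA, NA, NA, 5L)
)

# TRUE only for rows without any NA; na.omit() keeps exactly these rows
cc <- complete.cases(df)
cc

# No row is complete, so na.omit() has nothing left to keep
sum(cc)
nrow(na.omit(df))
```

Since no row is complete, a filter that distinguishes "some NAs" from "all NAs" is needed instead.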
Remove rows with all NAs using if_any()
One way to remove rows in which every value is NA is to combine filter() with if_any(). The predicate ~!is.na(.) is applied to every column, and if_any() keeps a row when at least one column has a non-missing value.
df %>%
  filter(if_any(everything(), ~!is.na(.)))
# A tibble: 3 × 3
     C1    C2    C3
  <int> <int> <int>
1     7     7    NA
2     6     9    NA
3    NA    NA     5
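The same condition can be phrased the other way round: drop a row only if all of its columns are NA. The sketch below uses dplyr's if_all() for this, again on a hand-typed copy of the toy data (an assumption, so the example runs on its own); the names kept_any and kept_all are only illustrative.

```r
library(dplyr)

# Toy data re-typed from the printed tibble (assumption: values match the seeded sample)
df <- tibble(
  C1 = c(7L, 6L, NA, NA, NA),
  C2 = c(7L, 9L, NA, NA, NA),
  C3 = c(NA, NA, NA, NA, 5L)
)

# Keep a row if any column is not NA
kept_any <- df %>%
  filter(if_any(everything(), ~ !is.na(.)))

# Equivalent: keep a row unless all columns are NA
kept_all <- df %>%
  filter(!if_all(everything(), is.na))

# Both filters keep the same three rows
identical(kept_any, kept_all)
```

Which form to use is mostly a matter of readability; the negated if_all() version reads closest to "remove rows with all NAs".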
Remove rows with all NAs using across()
Another option is to use dplyr's rowwise() function and check for NAs one row at a time. In the example below, we use across() to select all the columns of a given row, and then keep the row unless all of its values are NA.
df %>%
  rowwise() %>%
  filter(!all(is.na(across(everything())))) %>%
  ungroup()
# A tibble: 3 × 3
     C1    C2    C3
  <int> <int> <int>
1     7     7    NA
2     6     9    NA
3    NA    NA     5
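For comparison, the same row-by-row check can be written without rowwise() by counting the NAs in each row with base R's rowSums(). This is a swapped-in base R alternative, not part of the dplyr approach above, and it assumes the hand-typed copy of the toy data; the names na_count and kept are only illustrative.

```r
library(dplyr)

# Toy data re-typed from the printed tibble (assumption: values match the seeded sample)
df <- tibble(
  C1 = c(7L, 6L, NA, NA, NA),
  C2 = c(7L, 9L, NA, NA, NA),
  C3 = c(NA, NA, NA, NA, 5L)
)

# is.na(df) is a logical matrix; rowSums() counts the NAs in each row
na_count <- rowSums(is.na(df))

# Keep rows where fewer than all columns are NA
kept <- df[na_count < ncol(df), ]
kept
```

Because this works on the whole logical matrix at once, it avoids evaluating the condition separately for each row.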
Remove rows with all NAs using c_across()
We can also combine rowwise() with dplyr's c_across() function. The example below checks, row by row, whether all values in the numeric columns are NA. Because this example does not call ungroup(), the result is still a rowwise tibble.
df %>%
  rowwise() %>%
  filter(!all(is.na(c_across(where(is.numeric)))))
# A tibble: 3 × 3
# Rowwise: 
     C1    C2    C3
  <int> <int> <int>
1     7     7    NA
2     6     9    NA
3    NA    NA     5
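c_across() accepts any tidyselect column selection, so the check can be limited to a subset of columns. The sketch below is an illustrative variation, assuming the hand-typed copy of the toy data: it drops rows where both C1 and C2 are NA and adds ungroup() so the result is a regular tibble again. The name kept is only illustrative.

```r
library(dplyr)

# Toy data re-typed from the printed tibble (assumption: values match the seeded sample)
df <- tibble(
  C1 = c(7L, 6L, NA, NA, NA),
  C2 = c(7L, 9L, NA, NA, NA),
  C3 = c(NA, NA, NA, NA, 5L)
)

kept <- df %>%
  rowwise() %>%
  # Only C1 and C2 are inspected; C3 is ignored by the check
  filter(!all(is.na(c_across(C1:C2)))) %>%
  # Drop the rowwise grouping before further processing
  ungroup()

kept
```

Here the last row of the toy data is removed as well, because its only non-missing value is in C3, which the check does not look at.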