
In this post we will use the na.omit() function to remove rows with missing values (NA), first from a data frame and then from a matrix.
Creating data with missing values
Let us create a sample data frame with some missing values. We will use the data.frame() function from base R to build a simple data frame from scratch.
df <- data.frame(col1 = letters[1:5], col2 = c(1,2,NA,4,5), col3 = c(1:4,NA), col4 = 1:5)
The data frame has five rows, two of which (rows 3 and 5) contain a missing value, NA.
df
##   col1 col2 col3 col4
## 1    a    1    1    1
## 2    b    2    2    2
## 3    c   NA    3    3
## 4    d    4    4    4
## 5    e    5   NA    5
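Before removing anything, it can help to confirm which rows and columns actually hold the missing values. The following is a minimal sketch using base R's is.na(), rowSums() and colSums(); the names rows_with_na and na_per_column are our own and are used only for illustration.

```r
# Recreate the sample data frame from above
df <- data.frame(col1 = letters[1:5], col2 = c(1, 2, NA, 4, 5),
                 col3 = c(1:4, NA), col4 = 1:5)

# is.na() on a data frame returns a logical matrix;
# a row has a missing value if its row sum is greater than zero
rows_with_na <- rowSums(is.na(df)) > 0
which(rows_with_na)

# Number of missing values in each column
na_per_column <- colSums(is.na(df))
na_per_column
```

For this data frame, the check flags rows 3 and 5, with one NA each in col2 and col3.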
Removing rows with missing values in a data frame
We can remove rows containing one or more missing values (NA) using the na.omit() function in R. Applying na.omit() to our data frame returns a new data frame with three rows, after dropping the two rows with missing values. Note that the remaining rows keep their original row names (1, 2 and 4).
na.omit(df)
##   col1 col2 col3 col4
## 1    a    1    1    1
## 2    b    2    2    2
## 4    d    4    4    4
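If we want to know how many rows were dropped, and which ones, the result of na.omit() carries that information in an "na.action" attribute. Here is a small sketch under that assumption; the names clean_df and dropped are hypothetical, chosen for this example.

```r
# Recreate the sample data frame from above
df <- data.frame(col1 = letters[1:5], col2 = c(1, 2, NA, 4, 5),
                 col3 = c(1:4, NA), col4 = 1:5)

clean_df <- na.omit(df)

# How many rows were removed?
nrow(df) - nrow(clean_df)

# Which rows were removed? na.omit() stores their indices as an attribute
dropped <- attr(clean_df, "na.action")
dropped

# Optionally reset the row names so they run 1, 2, 3 again
rownames(clean_df) <- NULL
clean_df
```

Resetting the row names is a matter of taste: keeping the original ones makes it easy to trace rows back to the source data.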
Removing rows with missing values in a matrix
na.omit() can also be used to remove rows containing missing values (NA) from a matrix object. Here we create a matrix from the numerical columns of the data frame above.
data_matrix <- as.matrix(df[,2:4])
data_matrix
##      col2 col3 col4
## [1,]    1    1    1
## [2,]    2    2    2
## [3,]   NA    3    3
## [4,]    4    4    4
## [5,]    5   NA    5
Our matrix has three columns and five rows, and two of the rows have missing values. Applying na.omit() to the matrix removes those two rows and returns a new matrix with no missing values. The "na.action" attribute printed below the result records the indices of the removed rows (3 and 5).
na.omit(data_matrix)
##      col2 col3 col4
## [1,]    1    1    1
## [2,]    2    2    2
## [3,]    4    4    4
## attr(,"na.action")
## [1] 3 5
## attr(,"class")
## [1] "omit"
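The extra attributes are harmless, but sometimes a plain matrix is easier to work with. The sketch below shows two possible ways to get one: removing the attribute after na.omit(), or subsetting with complete.cases() from the stats package, which is not covered in this post but does a closely related job. The names clean_matrix and plain_matrix are our own.

```r
# Recreate the data frame and matrix from above
df <- data.frame(col1 = letters[1:5], col2 = c(1, 2, NA, 4, 5),
                 col3 = c(1:4, NA), col4 = 1:5)
data_matrix <- as.matrix(df[, 2:4])

# Option 1: use na.omit(), then drop the bookkeeping attribute
clean_matrix <- na.omit(data_matrix)
attr(clean_matrix, "na.action") <- NULL
clean_matrix

# Option 2: keep only the rows where every value is present;
# drop = FALSE keeps the result a matrix even if one row remains
plain_matrix <- data_matrix[complete.cases(data_matrix), , drop = FALSE]
plain_matrix
```

Both options should give the same three-row matrix here; the first keeps the na.omit() workflow from the post, while the second never creates the attribute in the first place.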