
Rstats 101

Learn R Programming Tips & Tricks for Statistics and Data Science


How to Replace NA values in a dataframe with Zeros?

rstats101 · January 11, 2022 ·

In this tutorial, we will learn how to replace all NA values in a dataframe with a specific value, such as zero, in R.

Create a dataframe with NA values

Let us get started by creating a dataframe with missing values, i.e. NAs, in its columns. We first create a vector with the sample() function, sampling with replacement from a vector that contains the integers 1 to 5 and NA.

set.seed(2020)
data <- sample(c(1:5,NA), 50, replace = TRUE)

Our data looks like this.

data

##  [1]  4  4 NA  1  1  4  2 NA  1  5  2  2 NA  5  2  3  2  5  4  2 NA NA  4 NA  4
## [26]  2  4  5  4  4  3 NA  2  2 NA  3  5  4  5  5  2  5  1 NA  3  5  1  5  3  1

Let us convert our data vector into a matrix using the matrix() function. Here we specify a matrix with 5 columns; since the vector has 50 elements, we get 10 rows, filled column by column.

data_mat <- matrix(data, ncol=5)

Our matrix with missing values looks like this.

head(data_mat)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    4    2   NA    3    2
## [2,]    4    2   NA   NA    5
## [3,]   NA   NA    4    2    1
## [4,]    1    5   NA    2   NA
## [5,]    1    2    4   NA    3
## [6,]    4    3    2    3    5

And then we convert the matrix into a dataframe using the as.data.frame() function.

data_df <- as.data.frame(data_mat)
head(data_df)
##   V1 V2 V3 V4 V5
## 1  4  2 NA  3  2
## 2  4  2 NA NA  5
## 3 NA NA  4  2  1
## 4  1  5 NA  2 NA
## 5  1  2  4 NA  3
## 6  4  3  2  3  5

Find the locations of NA values in R using the is.na() function

To replace NAs with zeros, we first need to find where the NAs are. We will use the is.na() function to test whether each element in the dataframe is NA or not.

is.na(data_df)

##          V1    V2    V3    V4    V5
##  [1,] FALSE FALSE  TRUE FALSE FALSE
##  [2,] FALSE FALSE  TRUE  TRUE FALSE
##  [3,]  TRUE  TRUE FALSE FALSE FALSE
##  [4,] FALSE FALSE  TRUE FALSE  TRUE
##  [5,] FALSE FALSE FALSE  TRUE FALSE
##  [6,] FALSE FALSE FALSE FALSE FALSE
##  [7,] FALSE FALSE FALSE FALSE FALSE
##  [8,]  TRUE FALSE FALSE FALSE FALSE
##  [9,] FALSE FALSE FALSE FALSE FALSE
## [10,] FALSE FALSE FALSE FALSE FALSE
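
Because R treats TRUE as 1 in arithmetic, the result of is.na() can also be used to count the missing values. The following is a minimal sketch that rebuilds the example dataframe so it runs on its own; colSums() and which() with arr.ind = TRUE are base R functions that the steps above do not use, and the variable names na_mask, na_per_column, total_na and na_positions are only illustrative.

```r
# Rebuild the example dataframe from the tutorial
set.seed(2020)
data <- sample(c(1:5, NA), 50, replace = TRUE)
data_df <- as.data.frame(matrix(data, ncol = 5))

# Logical matrix: TRUE where a value is missing
na_mask <- is.na(data_df)

# Number of NAs in each column (each TRUE counts as 1)
na_per_column <- colSums(na_mask)
print(na_per_column)

# Total number of NAs in the dataframe
total_na <- sum(na_mask)
print(total_na)

# Row and column position of every NA
na_positions <- which(na_mask, arr.ind = TRUE)
print(head(na_positions))
```

Checking these counts before replacing anything gives a sense of how much of the data would be overwritten.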

Replace all NA values with zeros in R

Applied to a dataframe, the is.na() function gives us a logical matrix of the same shape, and we can use it as an index to replace the NAs with zeros.

data_df[is.na(data_df)] <- 0

Now our dataframe does not have any NAs; we have replaced them all with zeros.

data_df
##    V1 V2 V3 V4 V5
## 1   4  2  0  3  2
## 2   4  2  0  0  5
## 3   0  0  4  2  1
## 4   1  5  0  2  0
## 5   1  2  4  0  3
## 6   4  3  2  3  5
## 7   2  2  4  5  1
## 8   0  5  5  4  5
## 9   1  4  4  5  3
## 10  5  2  4  5  1
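
As a quick check, and as a sketch of a narrower variant, the block below confirms that no NAs remain and then replaces NAs in only some columns. It rebuilds the example data so it runs on its own. The choice of columns V1 and V2 and the names all_zero_df, some_zero_df and cols are assumptions made for illustration; anyNA() is a base R function.

```r
# Rebuild the example dataframe from the tutorial
set.seed(2020)
data <- sample(c(1:5, NA), 50, replace = TRUE)
data_df <- as.data.frame(matrix(data, ncol = 5))

# Replace every NA with zero, as in the tutorial
all_zero_df <- data_df
all_zero_df[is.na(all_zero_df)] <- 0

# anyNA() is TRUE if at least one NA is left; FALSE is expected here
print(anyNA(all_zero_df))

# Variant: replace NAs only in selected columns (illustrative choice)
cols <- c("V1", "V2")
some_zero_df <- data_df
some_zero_df[cols][is.na(some_zero_df[cols])] <- 0

# NAs in the other columns are left as they were
print(colSums(is.na(some_zero_df)))
```

Restricting the replacement to named columns can be useful when a zero is a sensible stand-in for some variables but not for others.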

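Replace all NA values with some specific numerical value

The same approach works for any replacement value. In this example, we rebuild the dataframe from the matrix, so that it contains NAs again, and replace all NAs with 1000.

# Rebuild the dataframe; the earlier assignment overwrote its NAs
data_df <- as.data.frame(data_mat)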
data_df[is.na(data_df)] <- 1000
data_df
##      V1   V2   V3   V4   V5
## 1     4    2 1000    3    2
## 2     4    2 1000 1000    5
## 3  1000 1000    4    2    1
## 4     1    5 1000    2 1000
## 5     1    2    4 1000    3
## 6     4    3    2    3    5
## 7     2    2    4    5    1
## 8  1000    5    5    4    5
## 9     1    4    4    5    3
## 10    5    2    4    5    1
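
Since the replacement value is the only thing that changes between the two examples, the pattern can be wrapped in a small function. This is a sketch only: replace_na_with() is a hypothetical helper written for this illustration and is not part of base R or any package.

```r
# Hypothetical helper (not from base R or any package): returns a copy of
# `df` with every NA replaced by `value`
replace_na_with <- function(df, value = 0) {
  df[is.na(df)] <- value
  df
}

# Rebuild the example dataframe from the tutorial
set.seed(2020)
data <- sample(c(1:5, NA), 50, replace = TRUE)
data_df <- as.data.frame(matrix(data, ncol = 5))

zeros_df <- replace_na_with(data_df)
thousand_df <- replace_na_with(data_df, 1000)

# The original dataframe still has its NAs, so it does not need rebuilding
print(sum(is.na(data_df)))
print(sum(thousand_df == 1000))
```

Because R modifies the copy inside the function, the original dataframe keeps its NAs and can be reused with a different replacement value.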

P.S. NAs often carry useful information: they mark values that are genuinely missing. Before you replace all NAs with zeros or some other value, make sure that is the right thing to do. Imputing missing values is an active area of research in statistics.
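
To illustrate one simple alternative to a fixed value, the sketch below replaces each NA with the mean of the observed values in its column. This is only an example of the idea, not a recommendation; whether it is appropriate depends on the data. The name imputed_df is illustrative.

```r
# Rebuild the example dataframe from the tutorial
set.seed(2020)
data <- sample(c(1:5, NA), 50, replace = TRUE)
data_df <- as.data.frame(matrix(data, ncol = 5))

# Replace each NA with the mean of the non-missing values in its column
imputed_df <- data_df
for (col in names(imputed_df)) {
  col_mean <- mean(imputed_df[[col]], na.rm = TRUE)
  imputed_df[[col]][is.na(imputed_df[[col]])] <- col_mean
}

# Column means are unchanged by this kind of replacement
print(colMeans(imputed_df))
```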
