Rstats 101

Learn R Programming Tips & Tricks for Statistics and Data Science

How to Convert Numerical Variable into a Categorical Variable in R

rstats101 · November 23, 2022 ·

In this tutorial, we will learn how to convert a numerical (continuous) variable into a categorical variable in R. We will start with a simple example that converts a numerical variable into a categorical variable with just two levels. Then we will see an example of converting a numerical variable into a categorical variable with multiple levels.

library(tidyverse)

Let us create a simple data frame with one numerical column. Our numerical variable is a set of whole-number exam scores on a 0 to 100 scale.

set.seed(421)
df <- tibble(score= floor(runif(10, min=0, max=100)))
df

# A tibble: 10 × 1
   score
   <dbl>
 1    78
 2    14
 3    71
 4    31
 5    84
 6    69
 7    90
 8    68
 9    52
10    20
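
A small aside on the simulated data: runif() does not return its endpoints, so floor(runif(n, min=0, max=100)) gives whole numbers from 0 to 99 and never exactly 100. The sketch below checks that in base R and shows one possible alternative, sample(), for cases where 100 should also be possible. The names score and score_incl are only illustrative.

```r
set.seed(421)

# Same construction as in the tutorial: uniform draws, rounded down
score <- floor(runif(10, min = 0, max = 100))

# Every value is a whole number between 0 and 99
all(score >= 0 & score <= 99)

# One alternative (an assumption, not the tutorial's method):
# draw directly from the integers 0 to 100, so 100 can occur
score_incl <- sample(0:100, size = 10, replace = TRUE)
range(score_incl)
```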

Convert a Numerical Variable into a Categorical Variable with Two Levels

If we want to convert the numerical variable into a categorical variable with just two levels, we can use dplyr's if_else() function inside mutate(). Here a score above 40 is labelled "PASS" and everything else, including a score of exactly 40, is labelled "FAIL".

df %>%
  mutate(pass=if_else(score>40, "PASS", "FAIL"))

# A tibble: 10 × 2
   score pass 
   <dbl> <chr>
 1    78 PASS 
 2    14 FAIL 
 3    71 PASS 
 4    31 FAIL 
 5    84 PASS 
 6    69 PASS 
 7    90 PASS 
 8    68 PASS 
 9    52 PASS 
10    20 FAIL 
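
Note that the pass column above is a character column (<chr>), not a factor. If a true factor with a fixed level order is wanted, one option is to wrap the result in factor(). The sketch below uses base R's ifelse() on a plain vector holding the ten scores printed above, so it runs without tidyverse; the same factor() call could be used inside mutate().

```r
# The ten scores shown in the tutorial's output
score <- c(78, 14, 71, 31, 84, 69, 90, 68, 52, 20)

# Two-level split, stored as a factor with an explicit level order
pass <- factor(ifelse(score > 40, "PASS", "FAIL"),
               levels = c("FAIL", "PASS"))

levels(pass)   # "FAIL" "PASS"
table(pass)    # FAIL: 3, PASS: 7
```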

Convert a Numerical Variable into a Categorical Variable with Multiple Levels

To create a categorical variable with multiple levels, we can use the cut() function from base R. The basic use of cut(), as described on its help page, is:

cut divides the range of x into intervals and codes the values in x according to which interval they fall. The leftmost interval corresponds to level one, the next leftmost to level two and so on.
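
To make the help text concrete, here is a minimal sketch of what cut() returns when no labels are supplied. It uses the ten scores from the data frame above as a plain vector, with breaks at 0, 20, 40, 60, 80 and 100; the name bins is only illustrative.

```r
# The ten scores shown in the tutorial's output
score <- c(78, 14, 71, 31, 84, 69, 90, 68, 52, 20)

# Without labels, cut() names each level after its interval
bins <- cut(score, breaks = c(0, 20, 40, 60, 80, 100))

levels(bins)
# "(0,20]" "(20,40]" "(40,60]" "(60,80]" "(80,100]"

# The leftmost interval is level 1, the next is level 2, and so on
as.integer(bins)
# 4 1 4 2 5 4 5 4 3 1
```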

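For example, if we specify the breaks as 0, 20, 40, 60, 80, 100, we get five intervals and therefore a five-level categorical variable. By default each interval is open on the left and closed on the right, so a score of exactly 20 falls in the first interval, while a score of exactly 0 would fall outside every interval and become NA unless include.lowest = TRUE is set. In the example below, we use cut() to create a five-level categorical variable, grade, from score.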

df %>%
  mutate(grade = cut(score,
                    breaks = c(0, 20, 40, 60, 80, 100),
                    labels = c("F", "D", "C", "B", "A")))

# A tibble: 10 × 2
   score grade
   <dbl> <fct>
 1    78 B    
 2    14 F    
 3    71 B    
 4    31 D    
 5    84 A    
 6    69 B    
 7    90 A    
 8    68 B    
 9    52 C    
10    20 F    
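
Scores that sit exactly on a break deserve a closer look. The sketch below runs a few hypothetical boundary scores (0, 20, 40 and 100, chosen only for illustration) through cut() with the same breaks and labels, first with the defaults and then with the include.lowest and right arguments changed. Which behaviour is appropriate depends on the grading rule you have in mind.

```r
edge   <- c(0, 20, 40, 100)               # hypothetical boundary scores
breaks <- c(0, 20, 40, 60, 80, 100)
labels <- c("F", "D", "C", "B", "A")

# Default: intervals (0,20], (20,40], ..., (80,100]; a score of 0 is NA
default <- cut(edge, breaks = breaks, labels = labels)
as.character(default)        # NA "F" "D" "A"

# include.lowest = TRUE closes the first interval: [0,20]
with_lowest <- cut(edge, breaks = breaks, labels = labels,
                   include.lowest = TRUE)
as.character(with_lowest)    # "F" "F" "D" "A"

# right = FALSE flips the closed side: [0,20), [20,40), ..., [80,100]
left_closed <- cut(edge, breaks = breaks, labels = labels,
                   right = FALSE, include.lowest = TRUE)
as.character(left_closed)    # "F" "D" "C" "A"
```

With right = FALSE, a score of exactly 20 moves up to "D" and a score of exactly 40 moves up to "C", so the choice changes the grade for anyone sitting on a cutoff.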
