How to Convert a Numerical Variable into a Categorical Variable in R

In this tutorial, we will learn how to convert a numerical (continuous) variable into a categorical variable. We will start with a simple example that converts a numerical variable into a categorical variable with just two levels. Then we will see an example of converting a numerical variable into a categorical variable with multiple levels.

library(tidyverse)

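Let us create a simple data frame with a single numerical column. Our numerical variable is a set of exam scores. Because we take floor() of uniform random numbers between 0 and 100, the scores are whole numbers from 0 to 99.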

set.seed(421)
df <- tibble(score= floor(runif(10, min=0, max=100)))
df

# A tibble: 10 × 1
   score
   <dbl>
 1    78
 2    14
 3    71
 4    31
 5    84
 6    69
 7    90
 8    68
 9    52
10    20

Convert a Numerical Variable into a Categorical Variable with Two Levels

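If we want to convert the numerical variable into a categorical variable with just two levels, we can use the if_else() function from dplyr. It takes a condition, the value to return when the condition is true, and the value to return when it is false. Here, scores above 40 are labeled "PASS" and all other scores are labeled "FAIL".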

df %>%
  mutate(pass=if_else(score>40, "PASS", "FAIL"))

# A tibble: 10 × 2
   score pass 
   <dbl> <chr>
 1    78 PASS 
 2    14 FAIL 
 3    71 PASS 
 4    31 FAIL 
 5    84 PASS 
 6    69 PASS 
 7    90 PASS 
 8    68 PASS 
 9    52 PASS 
10    20 FAIL 
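
Note that if_else() returns a character column here (<chr> in the output above). If a factor with a fixed level order is needed, for example for plotting or modeling, one option is to wrap the result in factor(). The sketch below is only an illustration: the tibble scores, the result name scores_pass, and the choice to put "FAIL" before "PASS" are assumptions, not part of the example above.

```r
library(dplyr)

# Hypothetical scores, used only for this illustration
scores <- tibble(score = c(78, 14, 71, 31, 84))

# Same rule as above (score > 40 is a pass), but stored as a factor
# with an explicit level order: FAIL first, then PASS
scores_pass <- scores %>%
  mutate(pass = factor(if_else(score > 40, "PASS", "FAIL"),
                       levels = c("FAIL", "PASS")))

levels(scores_pass$pass)
```

With explicit levels, the order of the categories does not depend on which values happen to appear in the data.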

Convert a Numerical Variable into a Categorical Variable with Multiple Levels

To create a categorical variable with multiple levels, we use the cut() function in base R. The help page describes the basic use of cut() as follows:

cut divides the range of x into intervals and codes the values in x according to which interval they fall. The leftmost interval corresponds to level one, the next leftmost to level two and so on.
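
To make the description concrete, here is a minimal sketch of cut() called without labels, so that the interval names R generates are visible. The vector x and the variable names are made up for this illustration.

```r
# Hypothetical values, one for each of three different intervals
x <- c(5, 25, 95)

# Without labels, cut() names each level after its interval
binned <- cut(x, breaks = c(0, 20, 40, 60, 80, 100))

levels(binned)      # interval names, leftmost interval first
as.integer(binned)  # level codes: 1 for the leftmost interval, and so on
```

The levels come out as "(0,20]", "(20,40]", "(40,60]", "(60,80]" and "(80,100]", and the three values are coded as levels 1, 2 and 5.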

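For example, if we specify the breaks as 0, 20, 40, 60, 80, 100, we get five intervals and therefore a five-level categorical variable. By default, each interval is open on the left and closed on the right, so a score of 20 falls in the first interval, (0,20]. In the example below, we use cut() to create a five-level categorical variable from score and give the levels our own labels.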

df %>%
  mutate(grade = cut(score,
                    breaks = c(0, 20, 40, 60, 80, 100),
                    labels = c("F", "D", "C" ,"B", "A")))

# A tibble: 10 × 2
   score grade
   <dbl> <fct>
 1    78 B    
 2    14 F    
 3    71 B    
 4    31 D    
 5    84 A    
 6    69 B    
 7    90 A    
 8    68 B    
 9    52 C    
10    20 F    
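
One detail to watch is how cut() treats values that sit exactly on a break. With the default settings the first interval is (0,20], so a score of exactly 0 is not in any interval and becomes NA. The sketch below compares the default with two documented arguments of cut(), include.lowest and right. The vector edge_scores and the result names are hypothetical and only serve to show the boundary behavior.

```r
# Hypothetical scores that sit exactly on break points
edge_scores <- c(0, 20, 40, 100)
grade_breaks <- c(0, 20, 40, 60, 80, 100)
grade_labels <- c("F", "D", "C", "B", "A")

# Default (right = TRUE): intervals are (0,20], (20,40], ..., (80,100]
# A score of 0 lies outside the first interval and becomes NA
default_grade <- cut(edge_scores, breaks = grade_breaks, labels = grade_labels)

# include.lowest = TRUE closes the first interval: [0,20]
closed_grade <- cut(edge_scores, breaks = grade_breaks, labels = grade_labels,
                    include.lowest = TRUE)

# right = FALSE flips the closed side: [0,20), [20,40), ..., [80,100]
# (with right = FALSE, include.lowest = TRUE keeps the top break, 100)
left_closed_grade <- cut(edge_scores, breaks = grade_breaks, labels = grade_labels,
                         right = FALSE, include.lowest = TRUE)

data.frame(edge_scores, default_grade, closed_grade, left_closed_grade)
```

Which variant is appropriate depends on the grading rule: with right = FALSE a score of exactly 20 moves from "F" to "D", so the choice changes results at every boundary.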