I want to use the Age variable in my data to create an 'age category' variable with different age brackets. This is the code I ran:

        New_Data <- New_Data %>% mutate(Age_category = case_when(New_Data$Age <1 ~ "<1",
        New_Data$Age >=1 & New_Data$Age <=4 ~ "1-4",
        New_Data$Age >4  & New_Data$Age <=9 ~ "5-9",
        New_Data$Age >9  & New_Data$Age <=14 ~ "10-14",
        New_Data$Age >14 & New_Data$Age <=19 ~ "15-19",
        New_Data$Age >19 & New_Data$Age <=24 ~ "20-24",
        New_Data$Age >24 & New_Data$Age <=29 ~ "25-29",
        New_Data$Age >29 & New_Data$Age <=34 ~ "30-34",
        New_Data$Age >34 & New_Data$Age <=39 ~ "35-39",
        New_Data$Age >39 & New_Data$Age <=44 ~ "40-44",
        New_Data$Age >44 & New_Data$Age <=49 ~ "45-49",
        New_Data$Age >49 & New_Data$Age <=54 ~ "50-54",
        New_Data$Age >54 & New_Data$Age <=59 ~ "55-59",
        New_Data$Age >59 & New_Data$Age <=64 ~ "60-64",
        New_Data$Age >64 ~ "65+",
        New_Data$Age == "NULL" ~ "NULL"))

I expect the following values in the age category column:

 "<1", "1-4","5-9", "10-14", "15-19","20-24", "25-29", "30-34",  "35-39","40-44", "45-49", "50-54", "55-59",
      "60-64", "65+","NULL"

Although the age category variable is created successfully, it only contains four distinct age brackets ("5-9", "<1", "1-4", "65+"). I don't understand why the others haven't been created, even though those ages exist in the data.

Seth

1 Answer

If you ever find yourself using this many clauses in a case_when, you should ask yourself whether something simpler exists. In your case, using cut instead would save a lot of coding and debugging:

library(dplyr)

New_Data %>%
  # Open-ended outer breaks, so ages of 0 or above 100 do not become NA
  mutate(Age_category = cut(Age, 
                            breaks = c(-Inf, 0.5, seq(4.5, 64.5, 5), Inf), 
                            labels = c("<1", "1-4", "5-9", "10-14", "15-19",
                                       "20-24", "25-29", "30-34", "35-39",
                                       "40-44", "45-49",  "50-54", "55-59",
                                       "60-64", "65+")))
#>    Age Age_category
#> 1   68          65+
#> 2   39        35-39
#> 3    1          1-4
#> 4   34        30-34
#> 5   87          65+
#> 6   43        40-44
#> 7   14        10-14
#> 8   82          65+
#> 9   59        55-59
#> 10  51        50-54

Created on 2023-02-17 with reprex v2.0.2
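
One possible explanation for the four brackets seen in the question, offered as a guess rather than a diagnosis: the check Age == "NULL" suggests the Age column may be stored as character. If so, R compares the values as strings, so "12" sorts before "4" and every multi-digit age falls into one of the early brackets or the final "65+" catch-all. The sketch below uses a made-up vector (age_chr is hypothetical, not the asker's data) to show the effect, and how converting to numeric first avoids it.

```r
# Hypothetical ages stored as text, as they might be if the column held "NULL" strings
age_chr <- c("0", "3", "7", "12", "39", "68", "NULL")

# Comparing character with a number coerces the number to a string,
# so the comparison is alphabetical, not numeric
in_1_to_4_chr <- age_chr >= 1 & age_chr <= 4
in_1_to_4_chr[4]   # TRUE: "12" is treated as lying between "1" and "4"

# Convert to numeric first; anything non-numeric such as "NULL" becomes NA
age_num <- suppressWarnings(as.numeric(age_chr))
in_1_to_4_num <- age_num >= 1 & age_num <= 4
in_1_to_4_num[4]   # FALSE: 12 is not between 1 and 4
```

If class(New_Data$Age) does report "character", converting the column with as.numeric before calling cut or case_when should give the expected brackets.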


Data used

set.seed(1)
New_Data <- data.frame(Age = sample(100, 10))
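
The question also expects a "NULL" category for missing ages, which cut alone returns as NA. A minimal base R sketch of one way to get it, assuming missing ages are already NA in a numeric vector (the age vector here is made up for illustration, and the open-ended -Inf/Inf breaks are a choice of this sketch):

```r
# Hypothetical numeric ages, with one missing value
age <- c(0, 3, 68, NA)

age_category <- cut(age,
                    breaks = c(-Inf, 0.5, seq(4.5, 64.5, 5), Inf),
                    labels = c("<1", "1-4", "5-9", "10-14", "15-19",
                               "20-24", "25-29", "30-34", "35-39",
                               "40-44", "45-49", "50-54", "55-59",
                               "60-64", "65+"))

# addNA() makes NA an explicit factor level; then rename that level to "NULL"
age_category <- addNA(age_category)
levels(age_category)[is.na(levels(age_category))] <- "NULL"

as.character(age_category)
```

Keeping the result as a factor preserves the bracket order for tables and plots, with "NULL" as the last level.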
Allan Cameron