I am using R to clean a dataset. Part of my dataset looks like:
record_id | organization | other_work_loc
1         | 12           | CCC
2         | 12           | AMG
3         | 12           | TAO
4         | 1            |
5         | 2            |
6         | 7            |
other_work_loc is a free-response column with highly variable entries. It only has data when organization is 12. I would like to recategorize the organization and other_work_loc data into a single column (org_cat) with three categories ('1', '2', '3'). Most of the other_work_loc responses will be recategorized as '3'.
library(dplyr)

dataset <- dataset %>%
  mutate(org_cat = case_when(
    organization == 1 | organization == 2 ~ '1',
    organization >= 3 & organization < 12 ~ '2',
    other_work_loc == 'CCC' | other_work_loc == 'AMG' ~ '3'))
This code works, but there are 100 distinct free responses in other_work_loc. The majority will be recategorized as '3', but 22 of them need to be categorized as '1' or '2'. Is there a more elegant way to do this than writing out a condition for each individual response?
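One direction that might work, as a rough sketch only: list just the exceptions in two vectors and let every remaining organization 12 row fall through to '3'. The names to_cat1 and to_cat2 and their contents are hypothetical placeholders (putting 'TAO' in category '1' is made up for illustration), and the toy data frame only mirrors the sample rows above.

```r
library(dplyr)

# Toy data mirroring the sample rows shown above
dataset <- data.frame(
  record_id      = 1:6,
  organization   = c(12, 12, 12, 1, 2, 7),
  other_work_loc = c("CCC", "AMG", "TAO", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical exception lists: only the free responses that should NOT end up as '3'
to_cat1 <- c("TAO")        # placeholder; the real list would hold the responses mapped to '1'
to_cat2 <- character(0)    # placeholder; the real list would hold the responses mapped to '2'

dataset <- dataset %>%
  mutate(org_cat = case_when(
    organization %in% c(1, 2)             ~ "1",
    organization >= 3 & organization < 12 ~ "2",
    other_work_loc %in% to_cat1           ~ "1",
    other_work_loc %in% to_cat2           ~ "2",
    organization == 12                    ~ "3"  # every other free response defaults to '3'
  ))
```

Because case_when() uses the first condition that matches, the exception lists have to come before the catch-all organization == 12 line. With this layout only the 22 exceptions would need to be typed out, not all 100 responses.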