Recoding race variables into multiracial category by group

Question

I am trying to find the best way to recode the values in a column when a name is associated with more than one race.

I have been working with a dataframe like this:

df <- data.frame('Name' = c("Jon", "Jon", "Bobby", "Sarah", "Fred"),
                 'Race' = c("Black", "White", "Asian", "Asian", "Black"))

What I am trying to do is collapse any name that appears with more than one race into a single row and recode its race as a "multi-racial" category.

The end goal is to construct a dataframe like below:

df1 <- data.frame('Name' = c("Jon", "Bobby", "Sarah", "Fred"),
                  'Race' = c("Multiracial", "Asian", "Asian", "Black"))

My current approach is to count the distinct races recorded for each name, collect the names with more than one, and replace the race for those names only with "multi-racial". The code is shown below:

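library(dplyr)

# Drop exact duplicate Name/Race rows
df1 <- unique(df[, c('Name', 'Race')])

# Count the distinct races recorded for each name
multi_answer <-
  df1 %>%
  dplyr::group_by(Name) %>%
  dplyr::summarise(n_answers = dplyr::n_distinct(Race))

# Keep only the names with more than one race and recode them
multi_answer <- multi_answer[multi_answer$n_answers > 1, ]
df1[df1$Name %in% multi_answer$Name, 'Race'] <- 'multi-racial'

# Collapse the now-identical rows for the recoded names
df1 <- unique(df1)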

score 2 · Accepted Answer · answered Oct 26 '22 at 21:45

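You can group_by the Name and then summarise the data, using the condition "there is more than one entry for this name" (i.e., n() > 1). Because ifelse() receives a single TRUE/FALSE per group, it returns a single value: either "multi-racial" or the group's first Race.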

library(tidyverse)

df |>
  group_by(Name) |>
  summarise(Race = ifelse(n() > 1, "multi-racial", Race))
#> # A tibble: 4 x 2
#>   Name  Race        
#>   <chr> <chr>       
#> 1 Bobby Asian       
#> 2 Fred  Black       
#> 3 Jon   multi-racial
#> 4 Sarah Asian
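
Note that n() counts rows, so a name that is repeated with the same race would also be recoded. If that can happen in your data, a variant along the following lines may be safer. This is only a sketch: it counts distinct races with n_distinct() (as the question's own code does), and the extra Fred row is a hypothetical addition to the example data, included to show the difference. It also uses the "Multiracial" label from the desired output.

```r
library(dplyr)

# Data from the question, plus one hypothetical repeated row
# (Fred / Black) to show how exact duplicates are handled.
df <- data.frame(
  Name = c("Jon", "Jon", "Bobby", "Sarah", "Fred", "Fred"),
  Race = c("Black", "White", "Asian", "Asian", "Black", "Black")
)

result <- df |>
  group_by(Name) |>
  summarise(
    # The condition is a single value per group, so plain if/else is enough.
    # n_distinct() counts distinct races rather than rows.
    Race = if (n_distinct(Race) > 1) "Multiracial" else first(Race)
  )

result
```

With this version Fred should stay "Black", since both of his rows carry the same race, while Jon is still recoded.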
