Value in a third column based on the grouped of other columns

Question

I need to set a label for each id in column a, based on the existing values for this id. For example, if the id 1 only has "F" then the result will be "Female", if only "M" then "Male" and if mixed, then "Mixed".

This is the dataframe base:

    df=data.frame(
      a=c(1,1,1,2,2,3,3,3,3,3),
      b=c("F","M","F","M","M","F","F","F","F","F"))

And this is the expected result:

    df$Result=c("Mixed", "Mixed", "Mixed", "Male", "Male", "Female", "Female", "Female", "Female", "Female")

       a b Result
    1  1 F  Mixed
    2  1 M  Mixed
    3  1 F  Mixed
    4  2 M   Male
    5  2 M   Male
    6  3 F Female
    7  3 F Female
    8  3 F Female
    9  3 F Female
    10 3 F Female

Someone could please help me to calculate this df$Result column? Thanks in advance!

Hello. Can we please see your attempt. This is a simple `ifelse` per group type — Sotos, Aug 07 '19 at 13:10
Possible duplicate of [Using If/Else on a data frame](https://stackoverflow.com/questions/11865195/using-if-else-on-a-data-frame) — NelsonGon, Aug 07 '19 at 13:13
Hi, I think that an if/ifelse query would not help as they would be based in the unique value of the row for that column, but needs to check the rests of the values (I mean, needs to check on all the values which have a 1 in column a and their aggregation). — CrisXP, Aug 07 '19 at 15:42

akrun · Answer 1 · 2019-08-07T13:17:17.527

After grouping by 'a', check the number of distinct elements in 'b'. If it is greater than 1 return "Mixed" or else return the changed label in 'b'

library(dplyr)
df %>%
     mutate(b1 = c("Male", "Female")[(b == "F") + 1]) %>%
     group_by(a) %>%
     mutate(Result = case_when(n_distinct(b) > 1 ~ "Mixed", TRUE  ~ b1)) %>%
     select(-b1)
# A tibble: 10 x 3
# Groups:   a [3]
#       a b     Result
#   <dbl> <chr> <chr> 
# 1     1 F     Mixed 
# 2     1 M     Mixed 
# 3     1 F     Mixed 
# 4     2 M     Male  
# 5     2 M     Male  
# 6     3 F     Female
# 7     3 F     Female
# 8     3 F     Female
# 9     3 F     Female
#10     3 F     Female

data

df <- data.frame(
      a=c(1,1,1,2,2,3,3,3,3,3),
      b=c("F","M","F","M","M","F","F","F","F","F"),
      stringsAsFactors = FALSE)

score 3 · Answer 2 · answered Aug 07 '19 at 13:19

A solution with data.table:

library(data.table)
a = c(1,1,1,2,2,3,3,3,3,3)
b = c("F","M","F","M","M","F","F","F","F","F")
df = data.table(a, b)

df[, result := as.character(uniqueN(b)), a]
df[, result := ifelse(result == "1", ifelse(b == "M", "Male", "Female"), "Mixed")]
df
#     a b result
#  1: 1 F  Mixed
#  2: 1 M  Mixed
#  3: 1 F  Mixed
#  4: 2 M   Male
#  5: 2 M   Male
#  6: 3 F Female
#  7: 3 F Female
#  8: 3 F Female
#  9: 3 F Female
# 10: 3 F Female

Value in a third column based on the grouped of other columns

2 Answers2

data