1

I need to set a label for each id in column a, based on the existing values for this id. For example, if the id 1 only has "F" then the result will be "Female", if only "M" then "Male" and if mixed, then "Mixed".

This is the dataframe base:

    df=data.frame(
      a=c(1,1,1,2,2,3,3,3,3,3),
      b=c("F","M","F","M","M","F","F","F","F","F"))

And this is the expected result:

    df$Result=c("Mixed", "Mixed", "Mixed", "Male", "Male", "Female", "Female", "Female", "Female", "Female")

       a b Result
    1  1 F  Mixed
    2  1 M  Mixed
    3  1 F  Mixed
    4  2 M   Male
    5  2 M   Male
    6  3 F Female
    7  3 F Female
    8  3 F Female
    9  3 F Female
    10 3 F Female

Someone could please help me to calculate this df$Result column? Thanks in advance!

jogo
  • 12,469
  • 11
  • 37
  • 42
CrisXP
  • 31
  • 4
  • 1
    Hello. Can we please see your attempt. This is a simple `ifelse` per group type – Sotos Aug 07 '19 at 13:10
  • 1
    Well...nevermind lol – Sotos Aug 07 '19 at 13:12
  • Possible duplicate of [Using If/Else on a data frame](https://stackoverflow.com/questions/11865195/using-if-else-on-a-data-frame) – NelsonGon Aug 07 '19 at 13:13
  • Hi, I think that an if/ifelse query would not help as they would be based in the unique value of the row for that column, but needs to check the rests of the values (I mean, needs to check on all the values which have a 1 in column a and their aggregation). – CrisXP Aug 07 '19 at 15:42

2 Answers2

3

After grouping by 'a', check the number of distinct elements in 'b'. If it is greater than 1 return "Mixed" or else return the changed label in 'b'

library(dplyr)
df %>%
     mutate(b1 = c("Male", "Female")[(b == "F") + 1]) %>%
     group_by(a) %>%
     mutate(Result = case_when(n_distinct(b) > 1 ~ "Mixed", TRUE  ~ b1)) %>%
     select(-b1)
# A tibble: 10 x 3
# Groups:   a [3]
#       a b     Result
#   <dbl> <chr> <chr> 
# 1     1 F     Mixed 
# 2     1 M     Mixed 
# 3     1 F     Mixed 
# 4     2 M     Male  
# 5     2 M     Male  
# 6     3 F     Female
# 7     3 F     Female
# 8     3 F     Female
# 9     3 F     Female
#10     3 F     Female

data

df <- data.frame(
      a=c(1,1,1,2,2,3,3,3,3,3),
      b=c("F","M","F","M","M","F","F","F","F","F"),
      stringsAsFactors = FALSE)
akrun
  • 874,273
  • 37
  • 540
  • 662
3

A solution with data.table:

library(data.table)
a = c(1,1,1,2,2,3,3,3,3,3)
b = c("F","M","F","M","M","F","F","F","F","F")
df = data.table(a, b)

df[, result := as.character(uniqueN(b)), a]
df[, result := ifelse(result == "1", ifelse(b == "M", "Male", "Female"), "Mixed")]
df
#     a b result
#  1: 1 F  Mixed
#  2: 1 M  Mixed
#  3: 1 F  Mixed
#  4: 2 M   Male
#  5: 2 M   Male
#  6: 3 F Female
#  7: 3 F Female
#  8: 3 F Female
#  9: 3 F Female
# 10: 3 F Female
Masood Sadat
  • 1,247
  • 11
  • 18