My problem is as follows: in R, I need to find the variants that have more than one distinct outcome. So I have:

Variant Outcome
1 A
1 A
1 A
1 B
2 C
2 C
2 C
3 D
3 E
4 F
4 F
4 F
4 G
4 G
5 H
5 H
5 H
5 H

And my output should be: Variants 1, 3, and 4

I know about functions such as `duplicated` and `intersect`, but I have no clue how to combine them to get what I want.

M. waal
  • Note you need to use `n_distinct()` rather than `n()` as suggested in the dup post. i.e. `df %>% group_by(Variant) %>% filter(n_distinct(Outcome) >= 2)` with the `dplyr` package. – benson23 Mar 12 '23 at 11:40
  • I believe this would definitely be a dup question, but the dup link seems to be not accurate enough. [Select groups with more than one distinct value](https://stackoverflow.com/questions/31649049/select-groups-with-more-than-one-distinct-value) might be more accurate – benson23 Mar 12 '23 at 11:41
  • Don't know how, but this works indeed! I also never use the %>%. But it does the job, thanks! – M. waal Mar 12 '23 at 13:01
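
For anyone wondering why the `dplyr` one-liner from the comment above works, here is a minimal sketch on the data from the question. It assumes the data frame is called `df` with columns `Variant` and `Outcome`; the name `mixed` is just made up for illustration. `n()` would count rows per group, whereas `n_distinct(Outcome)` counts the different outcomes per group, which is what the question asks for.

```r
library(dplyr)

# Rebuild the example data from the question (assumed column names).
df <- data.frame(
  Variant = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5),
  Outcome = c("A", "A", "A", "B", "C", "C", "C", "D", "E",
              "F", "F", "F", "G", "G", "H", "H", "H", "H")
)

# Keep only the groups that contain at least two distinct outcomes.
mixed <- df %>%
  group_by(Variant) %>%
  filter(n_distinct(Outcome) >= 2) %>%
  ungroup()

# Reduce the filtered rows to the variant IDs themselves.
unique(mixed$Variant)
```

The `%>%` pipe simply passes the result on its left as the first argument of the function on its right, so the chain reads as "take `df`, group it by `Variant`, then keep the groups that pass the filter".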

1 Answer

Base R solution:

with(df, df[ave(Outcome, Variant, FUN = function(x) length(unique(x))) >= 2, ])

   Variant Outcome
1        1       A
2        1       A
3        1       A
4        1       B
8        3       D
9        3       E
10       4       F
11       4       F
12       4       F
13       4       G
14       4       G
TarJae
  • For some reason it doesn't like the \. It gives me the following error: Error: unexpected input in "with(tt_fin, tt_fin[ave(filtered_maf, variant, FUN = \" – M. waal Mar 12 '23 at 13:04
  • @M.waal You may have an older version of R. You could change the `\(x)` to `function(x)` and it should work – akrun Mar 12 '23 at 14:18
  • Indeed, with function(x) it worked. Thank you both for your help! – M. waal Mar 12 '23 at 23:09
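
Some background on the error discussed in the comments: the `\(x)` shorthand for `function(x)` was added in R 4.1.0, so older versions fail to parse it. If only the variant IDs are needed rather than the matching rows, a base R sketch along the following lines should also work on any R version. It assumes the same `df` as in the question; `n_outcomes` and `variants` are names made up for illustration. As far as I can tell, it also avoids a subtlety of `ave()`, which returns its result in the type of its first argument, so with a character `Outcome` the counts come back as strings.

```r
# Rebuild the example data from the question (assumed column names).
df <- data.frame(
  Variant = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5),
  Outcome = c("A", "A", "A", "B", "C", "C", "C", "D", "E",
              "F", "F", "F", "G", "G", "H", "H", "H", "H")
)

# Count the distinct outcomes per variant; the result is named by variant.
n_outcomes <- tapply(df$Outcome, df$Variant, function(x) length(unique(x)))

# Variants with two or more distinct outcomes.
variants <- as.numeric(names(n_outcomes)[n_outcomes >= 2])
variants
```

To get the rows back from this, `df[df$Variant %in% variants, ]` selects the same rows as the answer above.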