filter count distinct > 1

Question

Hi, I have a data frame `df` as below:

ID | Gender
1  | M
1  | F
2  | F
2  | F
2  | F
3  | M
3  | M
3  | F
4  | M
4  | M
4  | M

I'd like to keep only the IDs that have more than one distinct Gender, with one row per ID/Gender pair. This is to flag dirty data, since a person can't have more than one Gender. The result should be:

ID | Gender
1  | M
1  | F
3  | M
3  | F

How can I do this in R using dplyr?

Possible duplicate : https://stackoverflow.com/questions/31649049/select-groups-with-more-than-one-distinct-value — Ronak Shah, Feb 07 '20 at 08:01
@Sotos Well, if you wrap `unique` in all of those solutions it would give you expected output. In any case, it is not an "exact" duplicate. — Ronak Shah, Feb 07 '20 at 08:05

score 3 · Accepted Answer · answered Feb 07 '20 at 07:57

Using dplyr,

library(dplyr)

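df %>%
  group_by(ID) %>%                       # evaluate each ID separately
  filter(n_distinct(Gender) > 1) %>%     # keep IDs with more than one distinct Gender
  distinct(Gender)                       # drop repeated Gender rows within each ID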

which gives,

# A tibble: 4 x 2
# Groups:   ID [2]
  Gender    ID
  <chr>  <int>
1 M          1
2 F          1
3 M          3
4 F          3
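
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the question's data can be rebuilt as below (the `data.frame()` call and the `result` name are mine, not from the question), and it adds `ungroup()` with an explicit `distinct(ID, Gender)` so the result is a plain, ungrouped table. That is a small variation on the pipeline above, not a different method.

```r
library(dplyr)

# Assumed reconstruction of the question's data
df <- data.frame(
  ID     = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4),
  Gender = c("M", "F", "F", "F", "F", "M", "M", "F", "M", "M", "M"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(ID) %>%
  filter(n_distinct(Gender) > 1) %>%  # keep only IDs with 2+ distinct Gender values
  ungroup() %>%                       # drop the grouping before de-duplicating
  distinct(ID, Gender)                # one row per ID/Gender pair

print(result)
```

With the sample data this should leave only IDs 1 and 3, each with one M row and one F row. Dropping the grouping first means later steps are not silently run per ID.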
