
Hi, I have a data frame df as below:

ID | Gender
1  | M
1  | F
2  | F
2  | F
2  | F
3  | M
3  | M
3  | F
4  | M
4  | M
4  | M

I'd like to keep only the IDs that have more than one distinct Gender, with duplicate rows removed. (This is to flag dirty data, since a person can't have more than one Gender.) The result should be:

ID | Gender
1  | M
1  | F
3  | M
3  | F

How can I go about this in R using dplyr?

Sotos
spidermarn

1 Answer


Using dplyr,

library(dplyr)

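df %>% 
  group_by(ID) %>% 
  # keep only the IDs that have more than one distinct Gender
  filter(n_distinct(Gender) > 1) %>% 
  # drop repeated Gender values within each ID
  distinct(Gender)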

which gives,

# A tibble: 4 x 2
# Groups:   ID [2]
  Gender    ID
  <chr>  <int>
1 M          1
2 F          1
3 M          3
4 F          3
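
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the example data from the question and runs the same pipeline. The integer type for ID and character type for Gender are assumptions based on the printed output, and the trailing ungroup() is an optional addition that is not part of the pipeline above. The column order of the result may differ between dplyr versions.

```r
library(dplyr)

# Example data copied from the question (column types are assumed)
df <- data.frame(
  ID = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L),
  Gender = c("M", "F", "F", "F", "F", "M", "M", "F", "M", "M", "M"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(ID) %>%
  filter(n_distinct(Gender) > 1) %>%  # keep IDs with more than one distinct Gender
  distinct(Gender) %>%                # one row per ID/Gender pair (ID is kept as the grouping column)
  ungroup()                           # optional: drop the grouping for later steps

result
```

Because the data is still grouped by ID when distinct(Gender) runs, duplicates are removed within each ID rather than across the whole table, which is why both M and F survive for IDs 1 and 3.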
Sotos