I want to delete a group from the data frame if every value in its 'Reading' column is the same.

I want to group the data frame by ID and then check whether all values in the column Reading are the same within each group. If they are, I want to delete all the rows for that group.

Below is a data frame with three columns.

library(dplyr)
df <- tibble(
  Date = c('2019/1/1', '2019/1/2', '2019/2/2', '2019/2/5', '2019/2/7'),
  ID = c('a', 'a', 'b', 'b', 'b'),
  Reading = c(1, 1, 2, 1, 1)
)
df$Date <- as.Date(df$Date)
> df
# A tibble: 5 x 3
  Date       ID    Reading
  <date>     <chr> <dbl>
1 2019-01-01 a         1
2 2019-01-02 a         1
3 2019-02-02 b         2
4 2019-02-05 b         1
5 2019-02-07 b         1

I thought about using distinct in order to reduce the data frame.

df %>% group_by(ID) %>% 
  distinct(Reading, .keep_all = TRUE) %>% 
  filter(n()>1) %>% 
  ungroup

# A tibble: 2 x 3
  Date       ID    Reading
  <date>     <chr>   <dbl>
1 2019-02-02 b           2
2 2019-02-05 b           1

This, however, also removed one of the rows with a repeated Reading in group b (the 2019-02-07 row), which I want to keep.

My desired output is

# A tibble: 3 x 3
  Date       ID    Reading
  <date>     <chr> <dbl>
1 2019-02-02 b         2
2 2019-02-05 b         1
3 2019-02-07 b         1
camille
Sid

1 Answer

We group by 'ID' and filter to keep only the groups where 'Reading' has more than one unique value (counted with n_distinct).

library(dplyr)
df %>%
  group_by(ID) %>%
  filter(n_distinct(Reading) > 1)
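
As a rough, self-contained check of the approach above, the sketch below rebuilds the sample data from the question and applies the same filter. The trailing ungroup() and the base R variant using ave() are additions for illustration, not part of the original answer, and the names result, keep and result_base are hypothetical.

```r
library(dplyr)

# Sample data from the question
df <- tibble(
  Date = as.Date(c('2019/1/1', '2019/1/2', '2019/2/2', '2019/2/5', '2019/2/7')),
  ID = c('a', 'a', 'b', 'b', 'b'),
  Reading = c(1, 1, 2, 1, 1)
)

# Keep only the groups whose Reading takes more than one distinct value;
# ungroup() (an addition here) returns a plain, ungrouped tibble
result <- df %>%
  group_by(ID) %>%
  filter(n_distinct(Reading) > 1) %>%
  ungroup()

# Base R sketch of the same idea: ave() repeats the per-ID count of
# distinct readings on every row, which gives a row-wise logical mask
keep <- ave(df$Reading, df$ID, FUN = function(x) length(unique(x))) > 1
result_base <- df[keep, ]
```

On this data both result and result_base should contain only the three rows of group b. Note that n_distinct() and unique() both treat NA as a value by default, so a group containing one number plus NA would be kept unless missing values are handled first.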
akrun