Is there a way to filter out an entire group from a tibble based on the rows within that group?

Question

If I have a tibble where each row represents a component of some object, and multiple components share an object, is there a way to analyze all of a given object's components and remove its corresponding rows if the object doesn't match some condition?

For example, let's say I want to clean up the table below:

tib <- tibble(object = c("a", "a", "a", "a", "b", "b", "b"),
       component = c("x", "x", "y", "z", "x", "y", "y"),
       data = 1:7)

I know an object must contain exactly one component "x", and thus object "a" is not valid because it has two. So all four rows corresponding to object "a" need to be removed.

I know that the filter() function can work on whole groups but I'm struggling to find a way to analyze the group within the filter function. The closest I think I've come is below but it doesn't work at all. Maybe I'm completely off.

tib %>%
  group_by("object") %>%
  filter(count(cur_data(), component)$b != 1)

TimTeaFan · Accepted Answer · 2022-11-10T11:11:04.817

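We can check the condition sum(component == "x") == 1 in each group. The == 1 comparison drops objects with two or more "x" components as well as objects with none, which matches the "exactly one" requirement:

library(dplyr)

tib %>%
  group_by(object) %>%
  filter(sum(component == "x") == 1)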

#> # A tibble: 3 x 3
#> # Groups:   object [1]
#>   object component  data
#>   <chr>  <chr>     <int>
#> 1 b      x             5
#> 2 b      y             6
#> 3 b      y             7
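
To see what the grouped condition is working with, it can help to compute the per-object count on its own first. The following is only a sketch using the tib from the question; the column name n_x is made up for illustration.

```r
library(dplyr)

tib <- tibble(object = c("a", "a", "a", "a", "b", "b", "b"),
              component = c("x", "x", "y", "z", "x", "y", "y"),
              data = 1:7)

# component == "x" is a logical vector; sum() counts its TRUE values,
# so n_x is the number of "x" rows per object
counts <- tib %>%
  group_by(object) %>%
  summarise(n_x = sum(component == "x"))

# n_x is 2 for object "a" and 1 for object "b"
counts
```

Inside filter(), the same sum() is evaluated once per group, and the single TRUE or FALSE it yields keeps or drops all rows of that group.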

Alternatively, we can use unlist(table(component))["x"] to count how often component == "x" occurs in each group, and then keep only the groups where that count equals 1. A group with no "x" at all yields NA here, which filter() treats as a non-match, so that group is dropped too. This approach is more flexible when we want to check the occurrence of more than one component value.

library(dplyr)

tib %>% 
  group_by(object) %>% 
  filter(unlist(table(component))["x"] == 1L) 

#> # A tibble: 3 x 3
#> # Groups:   object [1]
#>   object component  data
#>   <chr>  <chr>     <int>
#> 1 b      x             5
#> 2 b      y             6
#> 3 b      y             7
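
As a rough illustration of the "more than one value" case, the sketch below keeps objects that have exactly one "x" and at least one "y". The second rule and the helper name has_valid_components are assumptions made up for this example, not part of the question. Fixing the factor levels makes table() report 0 instead of NA for a component that is absent from a group.

```r
library(dplyr)

tib <- tibble(object = c("a", "a", "a", "a", "b", "b", "b"),
              component = c("x", "x", "y", "z", "x", "y", "y"),
              data = 1:7)

# Hypothetical helper: returns a single TRUE/FALSE for one group's components
has_valid_components <- function(component) {
  # Count only the components we care about; absent ones get a count of 0
  counts <- table(factor(component, levels = c("x", "y")))
  counts[["x"]] == 1L && counts[["y"]] >= 1L
}

res <- tib %>%
  group_by(object) %>%
  filter(has_valid_components(component)) %>%
  ungroup()

# Only the three rows of object "b" remain
res
```

Moving the check into a small function keeps the filter() call readable once several component rules have to hold at the same time.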

Created on 2022-11-10 by the reprex package (v2.0.1)

Sorry I should have been more clear. It IS ok for there to be multiple or none of a component "y" or a component "z". But only component "x" has the requirement of exactly one. — 3luke33, Nov 10 '22 at 11:02
@3luke33 if you want to check only whether an object has exactly one `component == "x"`, then see my updated answer. — TimTeaFan, Nov 10 '22 at 11:05
