Subset a dataframe - grouped and multiple values R

Question

I am trying to subset a dataframe by multiple values in one column.

The input is the following:

Output should be:

So i want only the rows in the new dataframe, which contains 0 AND 1 in the column "Auto" - grouped for each shop.

Already tried this, but doesnt work:

test <- subset(rawdata, Auto == 0 &Auto == 1) test <- subset(rawdata, min(Auto) == 0 & max(Auto) == 1) test<- rawdata[ which(rawdata$Auto'==0 & rawdata$Auto== 1), ]

Thanks for any help. Regards

Your question seems to be ill-defined. "rows in the new dataframe, which contains 0 AND 1 in the column "Auto"." would imply that there are cells in `Auto` that contain both 0 and 1 simultaneously (which is not the case in your example). Were you trying to imply some kind of grouping instead? Your desired output, for instance, suggests you're grouping by the `Shop` column. — Dunois, Mar 11 '22 at 11:04
Yes you re right. Already tried something like this: test5 = data1%>% group_by(Shop)%>% subset(data1, `Is Automatic` == 0 &`Is Automatic` == 1) But think i need a function after grouping, which checks the whole column for 0 and 1. — user17549713, Mar 11 '22 at 11:18
Your example is not helpful, at least for me. You say you want only rows with 0 AND 1 in `Auto`, but none of the cells has both values, as pointed out by Dunois. Your desired output can be obtained just subsetting by `Shop == 1`. I cannot understand what you're trying to obtain. — Claudio, Mar 11 '22 at 11:24
For Example - there are 3 Rows for Shop 1 containing Auto ( 0 / 0 / 1 ); For Shop 2 is Auto ( 1 / 1 / 1 ) ; For Shop 3 is Auto ( 0 / 0 / 0 ) > I now want to extract the rows containing the Shops with both ( 0 & 1 ) in "Auto". In this example this is only Shop 1. — user17549713, Mar 11 '22 at 11:29

score 2 · Accepted Answer · answered Mar 11 '22 at 11:33

Please do not add data as images, provide data in a reproducible format

You can select the Shop where both 0 and 1 are present.

library(dplyr)

df %>%
  group_by(Shop) %>%
  filter(all(c(0, 1) %in% Auto)) %>%
  ungroup

#   Shop Order  Auto
#  <dbl> <dbl> <dbl>
#1     1     1     0
#2     1     2     0
#3     1     3     1

data

df <- structure(list(Shop = c(1, 1, 1, 2, 2, 2, 3, 3, 3), Order = c(1, 
2, 3, 1, 2, 3, 1, 2, 3), Auto = c(0, 0, 1, 1, 1, 1, 0, 0, 0)), 
class = "data.frame", row.names = c(NA, -9L))

score 1 · Answer 2 · answered Mar 11 '22 at 11:31

It is not very clear what you are trying to do based of your question. If I interpreted it correctly, you want to keep every row of shops where 1s and 0s occur.

To do this one possible solution might be to count the number of rows that each shop has and check wether that value is the same as the sum of auto (means all 1s) or equal to 0 (means all 0s).

If that criteria is met you want all rows of the shop to be excluded.

Look into the function summarise.

score 1 · Answer 3 · answered Mar 11 '22 at 11:32

Is this what you're looking for?

library(magrittr)
library(dplyr)

#Toy data.
df <- data.frame(Shop = c(1, 1, 1, 2, 2, 2, 3, 3, 3), 
                 Order = c(1, 2, 3, 1, 2, 3, 1 , 2, 3), 
                 Auto = c(0, 0, 1, 1, 1, 1, 0, 0, 0))

#Solution.
df %>% 
  group_by(Shop) %>%
  filter(n_distinct(Auto) > 1) %>%
  ungroup()

# # A tibble: 3 × 3
#    Shop Order  Auto
#   <dbl> <dbl> <dbl>
# 1     1     1     0
# 2     1     2     0
# 3     1     3     1

The key idea here is using dplyr::n_distinct() to count the number of unique values in Auto within each Shop group, and subsequently retaining only those groups that have more than 1 n_distinct values.

Ah, this should be a possible solution! Really appreciate your help. Thanks — user17549713, Mar 11 '22 at 11:34

Subset a dataframe - grouped and multiple values R

3 Answers3