Data example:

library(dplyr)
df1 <- data.frame(
  ID = c(1, 2, 2, 3, 4, 4, 5, 6, 6, 8, 9, 33, 32, 33, 33, 22, 22, 22, 23, 23),
  product = c("A", "B", "B", "A", "C", "D", "C", "A", "F", "A",
              "C", "P", "R", "Q", "W", "A", "B", "B", "D", "D")
)

The Problem:

1) If an "ID" has more than one distinct "product" value, then exclude all observations for that "ID". For example "ID": 4, 4 and "product": C, D.

2) If an "ID" is repeated with the same "product" value each time, then keep those observations. For example "ID": 2, 2 and "product": B, B.

3) If an "ID" and its "product" have only one observation, then keep that observation in the data frame. For example "ID": 1 and "product": A.

What I have tried:

df1 %>%
   group_by(ID, product) %>%
   filter(n() > 1)
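
To see why this attempt does not match the rules above, here is a small sketch that runs it on the example data. The name `attempt` is introduced here only for illustration. Grouping by both ID and product makes every group contain a single product, so the filter only checks whether an ID/product pair is repeated: it drops the single observations and keeps 22 B even though ID 22 also has product A.

```r
library(dplyr)

df1 <- data.frame(
  ID = c(1, 2, 2, 3, 4, 4, 5, 6, 6, 8, 9, 33, 32, 33, 33, 22, 22, 22, 23, 23),
  product = c("A", "B", "B", "A", "C", "D", "C", "A", "F", "A",
              "C", "P", "R", "Q", "W", "A", "B", "B", "D", "D")
)

# Same pipeline as the attempt: groups are (ID, product) pairs,
# and only pairs that occur more than once survive the filter.
attempt <- df1 %>%
  group_by(ID, product) %>%
  filter(n() > 1) %>%
  ungroup()

print(attempt)
# Rows kept: 2 B, 2 B, 22 B, 22 B, 23 D, 23 D
```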

Expected result:

ID product
1   A           
2   B           
2   B           
3   A           
5   C           
8   A           
9   C           
32  R           
23  D
23  D
Loncar
  • Please show your expected output for better understanding – akrun Aug 14 '18 at 13:51
  • `group_by(df1, ID) %>% filter(!length(unique(product)) > 1)` – r2evans Aug 14 '18 at 13:51
  • Ronak Shah, first of all, thanks for replying. Your result is not yet correct, because it keeps rows that share an ID but have different products, so it does not satisfy the first point of the problem. r2evans's solution is almost correct as well, but for some reason it does not return ID 23 with product D. Does anyone know why? The code looks great and logical, though. – Loncar Aug 14 '18 at 14:09
  • `df1 %>% group_by(ID) %>% filter(n_distinct(product) == 1)`? (r2evans already provides this result, btw.) – Frank Aug 14 '18 at 14:14
  • Hi Frank, thanks for replying. When you run it in RStudio, it does not yield the output from the expected result part of my post above. Second point: if the "ID" and the "product" features have the same repeated value, then keep those observations. For example "ID": 2, 2 and "product": B, B. Your code and r2evans's capture everything, except that they give back only one 23 D instead of 23 D, 23 D. Compare it with the 2 B, 2 B result; it should print like that. – Loncar Aug 14 '18 at 14:20
  • I have reopened the post but `df1 %>% group_by(ID) %>% filter(n_distinct(product) == 1)` gives me your expected output with two 23 D 23 D. – Ronak Shah Aug 14 '18 at 14:37
  • Pardon, you are right. I restarted R and afterwards it gave the right answer. Thank you! – Loncar Aug 14 '18 at 15:03
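
For reference, the approach suggested in the comments can be run end to end as the sketch below. It assumes a fresh R session with dplyr loaded; the name `result` and the trailing `ungroup()` are additions made here for illustration. The three rules in the question reduce to one condition: keep an ID only if it has exactly one distinct product.

```r
library(dplyr)

df1 <- data.frame(
  ID = c(1, 2, 2, 3, 4, 4, 5, 6, 6, 8, 9, 33, 32, 33, 33, 22, 22, 22, 23, 23),
  product = c("A", "B", "B", "A", "C", "D", "C", "A", "F", "A",
              "C", "P", "R", "Q", "W", "A", "B", "B", "D", "D")
)

# Group by ID only, then keep every row of an ID that maps to
# exactly one distinct product (repeated or single observations).
result <- df1 %>%
  group_by(ID) %>%
  filter(n_distinct(product) == 1) %>%
  ungroup()

print(result, n = Inf)
```

On the example data this keeps ten rows (IDs 1, 2, 2, 3, 5, 8, 9, 32, 23, 23), including both 23 D rows, which matches the expected result in the question.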
