Select the data which does not meet the requirement in R

Question

Let's say I have data like this.

     ConDate        ID  TreN  AriT
    20180424  54294631     1  8030
    20180424  54294631     2  8030
    20180425  25005102     1  8040
    20180425  25005102     2  8045

I want to find the rows that have the same ConDate, ID, and AriT but a different TreN.

In this case, the first and second rows should be selected.

I am not sure how to write a query for this kind of situation.

I also want to add an extra column next to 'AriT' that says Y for rows that meet the requirement and N for rows that do not.

Can someone please help me? Thanks!

`df %>% group_by(ConDate, ID, AriT) %>% mutate(new_col = n_distinct(TreN) > 1) ` — Ronak Shah, Aug 28 '18 at 01:38

Maurits Evers · Answer 1 · 2018-08-28T02:34:18.433

Perhaps something like this using dplyr::group_by and dplyr::filter?

    library(dplyr)
    df %>%
        group_by(ConDate, ID, AriT) %>%
        filter(n_distinct(TreN) > 1)
    ## A tibble: 2 x 4
    ## Groups:   ConDate, ID, AriT [1]
    #   ConDate       ID  TreN  AriT
    #     <int>    <int> <int> <int>
    #1 20180424 54294631     1  8030
    #2 20180424 54294631     2  8030

Sample data

    df <- read.table(text =
        "   ConDate    ID    TreN  AriT
       20180424  54294631  1  8030
       20180424  54294631  2  8030
       20180425  25005102  1  8040
       20180425  25005102  2  8045", header = TRUE)
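
If you want a Y/N column instead of filtering the rows, the same grouping should work with dplyr::mutate. This is only a minimal sketch: the column name Flag is my own placeholder (it is not in the question), and the block repeats the sample data so it can be run on its own.

```r
library(dplyr)

# Same sample data as in the question
df <- read.table(text =
    "ConDate    ID    TreN  AriT
     20180424  54294631  1  8030
     20180424  54294631  2  8030
     20180425  25005102  1  8040
     20180425  25005102  2  8045", header = TRUE)

# "Flag" is a hypothetical column name; rename it as needed.
# Within each (ConDate, ID, AriT) group, mark every row "Y" when the
# group contains more than one distinct TreN, otherwise "N".
res <- df %>%
    group_by(ConDate, ID, AriT) %>%
    mutate(Flag = if_else(n_distinct(TreN) > 1, "Y", "N")) %>%
    ungroup()

res
```

For the sample data, the first two rows get Y and the last two get N, because the last two rows differ in AriT and so fall into separate groups. Since mutate appends new columns at the end, Flag lands right after AriT.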

edited Aug 28 '18 at 02:34

answered Aug 28 '18 at 00:21

Maurits Evers

@RonakShah Yes, I think you're right. Thanks, I made an edit. – Maurits Evers Aug 28 '18 at 02:34
@RonakShah Ah just saw your comment at the top. If you want to post as independent answer I'll remove my edit. – Maurits Evers Aug 28 '18 at 02:35
No, it's fine. You had got it right anyway. OP wants a new column though with "Y" and "N" instead of filtering the rows. – Ronak Shah Aug 28 '18 at 02:40
