Find duplicates in different rows

Question

I have a data frame such as this:

   Country1 Country2    year
          A   B         1993
          A   B         1994
          A   C         1993
          A   C         1994
          B   A         1993
          B   A         1994
          B   C         1993 
          B   C         1994

I need to get rid of all rows that duplicate the combination of columns one and two.

I wrote my own function, but it is too slow on a large dataset. Is there a more efficient way?

How does your required output differ from what `unique` or `duplicated` gives you? – discipulus, Mar 03 '17 at 01:19

score 0 · Answer 1 · answered Mar 03 '17 at 01:27

Is this what you were looking for?

 # Rebuild the example data from the question
 Country1 <- c("A", "A", "A", "A", "B", "B", "B", "B")
 Country2 <- c("B", "B", "C", "C", "A", "A", "C", "C")
 year <- c("1993", "1994", "1993", "1994", "1993", "1994", "1993", "1994")
 dat <- data.frame(
    Country1,
    Country2,
    year
    )

 # Keep only the first row for each (Country1, Country2) combination
 dat <- dat[ !duplicated( dat[ ,c(1, 2)]), ]
 dat

   Country1 Country2 year
 1        A        B 1993
 3        A        C 1993
 5        B        A 1993
 7        B        C 1993

answered Mar 03 '17 at 01:27

DryLabRebel

My apologies, I didn't explain this clearly. I need the data frame to look like this: `CountryA <- c("A","A","A","A","B","B"); CountryB <- c("B","B","C","C","C","C"); year <- c(1993,1994,1993,1994,1993,1994); df <- data.frame(CountryA,CountryB,year); print(df)` The data are on bilateral trade, so repeated pairs of countries (e.g. B-A after A-B) are redundant and I need to get rid of them. Thanks for helping! – oudzi Mar 03 '17 at 10:59
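
Going by the expected output in the comment above, the goal seems to be to treat a pair of countries as the same regardless of column order, while keeping one row per year. A minimal sketch of one way to do that, not the answerer's method: build an order-independent key with `pmin`/`pmax` and pass it to `duplicated`. The names `first`, `second` and `result` are made up for illustration, and the sketch assumes the country columns are character vectors, not factors.

```r
# Example data from the question (countries kept as character, not factor)
dat <- data.frame(
  Country1 = c("A", "A", "A", "A", "B", "B", "B", "B"),
  Country2 = c("B", "B", "C", "C", "A", "A", "C", "C"),
  year     = c(1993, 1994, 1993, 1994, 1993, 1994, 1993, 1994),
  stringsAsFactors = FALSE
)

# Order-independent key: the alphabetically smaller country always comes first,
# so A-B and B-A produce the same (first, second) pair
first  <- pmin(dat$Country1, dat$Country2)
second <- pmax(dat$Country1, dat$Country2)

# Keep the first occurrence of each (unordered pair, year) combination
result <- dat[!duplicated(data.frame(first, second, year = dat$year)), ]
rownames(result) <- NULL
result
```

Because `pmin`, `pmax` and `duplicated` are all vectorised, this avoids looping over rows, which is usually what makes a hand-written function slow on large data.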
