
I get the following message when I try to create my panel data: "duplicate couples (id-time) in resulting pdata.frame".

I already know that some (id, date) couples occur twice, but I don't know how to fix it. Does anyone have an idea?

library(plm)
pdata <- pdata.frame(TestTable, index = c("id", "date"))
table(index(pdata), useNA = "ifany")

The table shows that some (id, date) couples occur twice; the counts range from 0 to 2.

View(table(index(pdata), useNA = "ifany")) 

Checking again whether duplicate couples exist returns TRUE:

any(table(index(pdata), useNA = "ifany") > 1)
Helix123

1 Answer


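If you want to remove all duplicate couples (id-time) from your data, you can use unique() from the data.table package, whose by argument restricts the comparison to the key columns. Base R's unique() has no by argument and only drops rows that are identical in every column, so apply the data.table method to the underlying table before building the pdata.frame:

library(data.table)
TestTable <- unique(as.data.table(TestTable), by = c("id", "date"))
pdata <- pdata.frame(as.data.frame(TestTable), index = c("id", "date"))

or, as an alternative in base R, use duplicated() on the key columns:

TestTable <- TestTable[!duplicated(TestTable[, c("id", "date")]), ]
pdata <- pdata.frame(TestTable, index = c("id", "date"))

Both approaches keep the first row of each (id, date) couple and drop the rest.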
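Rows that share an (id, date) couple may still differ in other columns, so it can be worth looking at them before dropping any. A minimal base R sketch, assuming a toy data frame in place of TestTable (the values and the value column are made up for illustration):

```r
# Toy data standing in for TestTable (hypothetical values).
TestTable <- data.frame(
  id    = c(1, 1, 1, 2, 2),
  date  = as.Date(c("2020-01-01", "2020-01-01", "2020-01-02",
                    "2020-01-01", "2020-01-02")),
  value = c(10, 11, 12, 20, 21)
)

key <- TestTable[, c("id", "date")]

# Flag every row whose (id, date) couple occurs more than once,
# including the first occurrence, so the conflicting rows can be compared.
is_dup   <- duplicated(key) | duplicated(key, fromLast = TRUE)
dup_rows <- TestTable[is_dup, ]
print(dup_rows)

# Keep only the first row of each (id, date) couple.
TestTable_unique <- TestTable[!duplicated(key), ]

# Confirm that no duplicated couples remain.
stopifnot(anyDuplicated(TestTable_unique[, c("id", "date")]) == 0)
```

If the duplicated rows carry different values, aggregating them (for example, taking the mean per couple) may be more appropriate than keeping the first row; which is right depends on how the duplicates arose.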
red_quark