
I have a big data frame called `data`. Consider this example:

`data <- data.frame(DATE = c(2012, 2012, 2013, 2014, 2014, 2015),
             NAME = c("A", "G", "N", "L",'L' "L"),
             LCR = c(1, 3, 5, 4, 5, 1),
             MWFR=c(0,0,0,0,0,0,1,1),
             reg=c(1,1,0,0,1,1,1,1))
 

I want to run a regression, but when I run it I get this error:

pdata <- pdata.frame(data, index = c("NAME", "DATE"))
regmodelfix<- plm(LCR ~ MWFR+reg+ MWFR*reg , model ='within',data=pdata, effect = 'twoways')

error :duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting

I realized that I need a unique name for each row built from the NAME and DATE columns, so I tried `make.names`:

rownames(data) <- make.names(data$NAME, unique = TRUE)

but it does not work. Any ideas?

  • Your question is not reproducible since you did not include a data set that reproduces your issue, but you could try to first copy your rownames into a temp column `rownames(data) -> data$tempname`, and then delete them `rownames(data) <- NULL`. – Otto Kässi Nov 23 '21 at 10:01
  • I don't want to remove the rows, I just want to rename them – Marzieh Karimi Nov 23 '21 at 10:08
  • `rownames(data) <- NULL` does not remove rows, just their names. – Otto Kässi Nov 23 '21 at 10:08
  • I tried to edit my question – Marzieh Karimi Nov 23 '21 at 10:08
  • please attach your data using dput() – Otto Kässi Nov 23 '21 at 10:09
  • Is there any code to just rename the rows of the NAME and DATE columns? – Marzieh Karimi Nov 23 '21 at 10:10
  • Your issue is related to rownames(data), not data$NAME or data$DATE. You can rename your data$NAME column by e.g. data$NAME <- seq(1, nrow(data), 1), or data$NAME <- "my_new_name", or whatever you want, but I suspect that will not help you. – Otto Kässi Nov 23 '21 at 10:13
  • @OttoKässi `rownames(data) <- NULL` does not delete the row names, it deletes and recreates them with consecutive integers coerced to character. It guarantees unique row names, like the OP wants. `rownames(data) <- NULL` *is* the solution. – Rui Barradas Nov 23 '21 at 10:14
  • This issue is related to your issue posted here: https://stackoverflow.com/questions/70044719/plm-in-two-way-fixed-effects-model-with-individual-firm-and-time-fixed-effects . Already the line with `pdata.frame` should give a warning. Once you fix the warning, the estimation by `plm` should work. – Helix123 Nov 23 '21 at 10:22
  • @RuiBarradas I used these codes : `rownames(data) <- NULL` `pdata <- pdata.frame(m, index = c("NAME", "DATE"))` then `regmodelfix<- plm(LCR ~ MWFR+reg+ MWFR*reg+lagTA+lagDR+lagROAA+lagTCR ,model ='within',data=data, effect = 'twoways') ` and I got this error : `Error in get(.Generic)(e1, e2) : non-numeric argument to binary operator In addition: Warning message: – Marzieh Karimi Nov 23 '21 at 11:12
  • `In pdata.frame(data, index) : duplicate couples (id-time) in resulting pdata.frame to find out which, use, e.g., table(index(your_pdataframe), useNA = "ifany")` – Marzieh Karimi Nov 23 '21 at 11:12
  • @Helix123 I ran `table(index(data), useNA = "ifany")` and I got a table of the data with 1 value. So how can I fix it? – Marzieh Karimi Nov 23 '21 at 11:15
  • Rather, run it on your pdata.frame: `table(index(pdata), useNA = "ifany")`. Your data has duplicates for an id-time combination (or is interpreted as such by `pdata.frame`), which is not allowed for panel data (with 2 dimensions). – Helix123 Nov 23 '21 at 11:40
  • Also, people could help way better if you could provide a reproducible example. – Helix123 Nov 23 '21 at 11:42
  • @Helix123 I did it here – Marzieh Karimi Nov 23 '21 at 12:03
  • The code you provided in your post is not reproducible (it errors at the first command). I provided an answer with reproducible code. – Helix123 Nov 26 '21 at 11:53

1 Answer


The code in the original post is not reproducible. I provide a reproducible example and highlight the issue below:

data <- data.frame(DATE = c(2012, 2012, 2013, 2014, 2014, 2015),
                   NAME = c("A", "G", "N", "L",'L', "L"),
                   LCR  = c(1,    3,   5,   4,  5,   1),
                   MWFR = c(0,    0,   0,   0,  0,   1),
                   reg  = c(1,    1,   0,   0,  1,   1))

library(plm)
pdata <- pdata.frame(data, index = c("NAME", "DATE"))
#> Warning in pdata.frame(data, index = c("NAME", "DATE")): duplicate couples (id-time) in resulting pdata.frame
#>  to find out which, use, e.g., table(index(your_pdataframe), useNA = "ifany")

# use hint given by the warning from pdata.frame
table(index(pdata), useNA = "ifany")
#>     DATE
#> NAME 2012 2013 2014 2015
#>    A    1    0    0    0
#>    G    1    0    0    0
#>    L    0    0    2    1
#>    N    0    1    0    0

The table shows that the panel data set contains two entries for NAME = L and DATE = 2014. Such duplicate id-time couples are not allowed in panel data because the identifier (the combination of the two dimensions, id and time) must be unique in the data set.
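With many ids and time periods, the full table becomes hard to read. As a rough sketch (base R only, not part of plm; the names `counts` and `dups` are just illustrative), the same counts can be put into long format and filtered down to the combinations that occur more than once:

```r
# Same example data as above
data <- data.frame(DATE = c(2012, 2012, 2013, 2014, 2014, 2015),
                   NAME = c("A", "G", "N", "L", "L", "L"),
                   LCR  = c(1, 3, 5, 4, 5, 1),
                   MWFR = c(0, 0, 0, 0, 0, 1),
                   reg  = c(1, 1, 0, 0, 1, 1))

# Count rows per NAME-DATE combination, in long format
# (columns NAME, DATE, Freq)
counts <- as.data.frame(table(NAME = data$NAME, DATE = data$DATE))

# Keep only the id-time combinations that appear more than once
dups <- counts[counts$Freq > 1, ]
print(dups)
```

For the example data, only the combination L / 2014 remains.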

This can also be detected by looking at the data directly, but it is harder to spot and not practical with large data sets:

print(data)
#>   DATE NAME LCR MWFR reg
#> 1 2012    A   1    0   1
#> 2 2012    G   3    0   1
#> 3 2013    N   5    0   0
#> 4 2014    L   4    0   0
#> 5 2014    L   5    0   1
#> 6 2015    L   1    1   1
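
For larger data sets, the offending rows can be listed programmatically instead of by eye. The sketch below uses base R's `duplicated()` and then shows two possible ways to make the id-time couples unique. Which one is appropriate (if any) depends on why the duplicates exist in your data, so treat both as assumptions rather than a recommendation; the names `key`, `is_dup`, `data_first` and `data_mean` are only illustrative.

```r
# Same example data as above
data <- data.frame(DATE = c(2012, 2012, 2013, 2014, 2014, 2015),
                   NAME = c("A", "G", "N", "L", "L", "L"),
                   LCR  = c(1, 3, 5, 4, 5, 1),
                   MWFR = c(0, 0, 0, 0, 0, 1),
                   reg  = c(1, 1, 0, 0, 1, 1))

# Flag every row whose NAME-DATE pair occurs more than once
# (fromLast = TRUE also flags the first occurrence of each pair)
key    <- data[, c("NAME", "DATE")]
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
print(data[is_dup, ])

# Option 1 (assumption: later duplicates are redundant): keep the first row
data_first <- data[!duplicated(key), ]

# Option 2 (assumption: averaging is meaningful for these variables):
# collapse duplicates to their mean; note that a 0/1 dummy such as reg
# can end up fractional
data_mean <- aggregate(cbind(LCR, MWFR, reg) ~ NAME + DATE,
                       data = data, FUN = mean)
```

Either result has one row per NAME-DATE pair, so `pdata.frame(data_first, index = c("NAME", "DATE"))` should no longer warn about duplicate couples.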
Helix123