R: Error in pi[[j]] : subscript out of bounds -- rbind on a list of dataframes

Question

I am trying to rbind a large list of data frames (outputDfList), which is generated by applying a complicated function to a large table with lapply. You can recreate outputDfList with:

df1=data.frame("randomseq_chr15q22.1_translocationOrInsertion", "chr15", "63126742")
names(df1)=NULL
df2=df1=data.frame("chr18q12.1_chr18q21.33_large_insertion", "chr18 ", "63126741")
names(df2)=NULL
outputDfList=list(df1,df2)

My code is:

do.call(rbind, outputDfList)

The error message I received:

Error in pi[[j]] : subscript out of bounds

I double-checked the number of columns in each data frame and they are all the same. I also tried `options(error=recover)` for debugging, but I'm not familiar enough with it to pin down the exact issue. Any help is appreciated. Thank you.

I’m unable to reproduce the error message. You’ll need to construct a minimal example to reproduce the problem, and post the exact code/data to reproduce it here. [reprex may be helpful for that.](http://jennybc.github.io/reprex/) — Konrad Rudolph, Jan 16 '17 at 17:39
@KonradRudolph Thanks a lot for the comment. You are right. I added back the long names of my dataframes and I think now it should show the error. — Helene, Jan 16 '17 at 18:42
Unfortunately this isn’t sufficient since we still don’t know exactly what your data looks like (if I try reconstructing your data from what you’ve posted, the command works). Could you please `dput` the relevant data? — Konrad Rudolph, Jan 16 '17 at 19:22
@KonradRudolph Thank you for being so patient. I could not dput the original data because outputDfList is generated by applying a complicated function to a table with lapply. However, I was able to reproduce the error using the code above. Would you please try the code and let me know if you see the error? Thanks a lot. – Helene, Jan 16 '17 at 19:28
Why are you setting the column names to NULL? rbind is trying to match up columns by name - difficult if there aren't any — Richard Telford, Jan 17 '17 at 21:20
@RichardTelford You are right. I didn't realize that. I set it to NULL to mimic my original code. The dataframes were generated with different colnames by default, so I had to reset them. Now it is fixed thank you. — Helene, Jan 17 '17 at 22:42
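
Following up on the point that rbind matches data frame columns by name, a quick check along these lines can show whether any data frame in a list has no column names, or whether the names disagree across the list. This is only a sketch: `dfs`, `name_list`, `has_no_names` and `all_match` are hypothetical names, and the two small data frames are made-up stand-ins for the real data.

```r
# Two made-up data frames; the second has its names removed,
# as in the question.
df_named <- data.frame(a = 1, b = "x")
df_bare <- data.frame(a = 2, b = "y")
names(df_bare) <- NULL

dfs <- list(df_named, df_bare)

# Collect the column names of every data frame in the list.
name_list <- lapply(dfs, names)

# TRUE for each data frame that has no column names at all.
has_no_names <- vapply(name_list, is.null, logical(1))

# TRUE only if every data frame has exactly the same names as the first.
all_match <- all(vapply(name_list, identical, logical(1), name_list[[1]]))

print(has_no_names)
print(all_match)
```

If `has_no_names` contains any `TRUE`, or `all_match` is `FALSE`, the column names probably need to be set consistently before calling `do.call(rbind, ...)`.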

score 7 · Accepted Answer · answered Jan 16 '17 at 19:34

After the update it seems that your problem is that you have invalid column names: Data frame column names must be non-null.

After correcting this, the code then works:

for (i in seq_along(outputDfList)) {
    colnames(outputDfList[[i]]) = paste0('V', seq_len(ncol(outputDfList[[i]])))
}

do.call(rbind, outputDfList)
#                                       V1     V2       V3
# 1 chr18q12.1_chr18q21.33_large_insertion chr18  63126741
# 2 chr18q12.1_chr18q21.33_large_insertion chr18  63126741
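
The same renaming can also be written without an explicit loop. The following is a sketch of that alternative, not the only way to do it; the list `unnamed_list` and its contents are made up for illustration, and `setNames()` comes from the stats package that R attaches by default.

```r
# Made-up stand-in for a list of data frames without column names.
make_unnamed <- function(...) {
  df <- data.frame(...)
  names(df) <- NULL
  df
}
unnamed_list <- list(
  make_unnamed("eventA", "chr15", "100"),
  make_unnamed("eventB", "chr18", "200")
)

# Give every data frame the same generic names V1, V2, ...
renamed_list <- lapply(unnamed_list, function(df) {
  setNames(df, paste0("V", seq_len(ncol(df))))
})

combined <- do.call(rbind, renamed_list)
print(combined)
```

This assumes every data frame has the same number of columns in the same order, since the generic names carry no information about what each column holds.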

However, I’m puzzled how this situation occurred in the first place. Furthermore, the error message I’m getting with your code is still distinct from yours:

Error in match.names(clabs, names(xi)) :
names do not match previous names

Konrad Rudolph

Thanks for the reply. I am puzzled by it as well... but you are absolutely right that I need column names for my data frames. I added this to the function that generated the list of data frames, and it worked. Thank you! – Helene Jan 16 '17 at 20:19
I've seen both errors now. I was trying to call `do.call(rbind, myList)` on a list of data frames when I got the match.names error. The data frames all had different column names so I used `lapply(myList, unname)` thinking this would fix the problem but then when I tried `do.call()` again, I got the subscript out of bounds error described above. As described in the comments above, this has the effect of setting the column names to NULL so `rbind()` fails. – syntonicC Apr 11 '18 at 21:05
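
The sequence described in this comment can be sketched as follows, using two made-up data frames (`df_a`, `df_b`) with different column names. The errors are caught with `tryCatch()` so the script keeps running. The exact message for the unnamed case may depend on the R version, so it is printed rather than assumed.

```r
# Two made-up data frames whose column names differ.
df_a <- data.frame(x = 1, y = "a")
df_b <- data.frame(p = 2, q = "b")
my_list <- list(df_a, df_b)

# Step 1: different names, so rbind() cannot match the columns.
err_named <- tryCatch(
  do.call(rbind, my_list),
  error = function(e) conditionMessage(e)
)
print(err_named)

# Step 2: unname() removes the column names entirely.
stripped <- lapply(my_list, unname)
print(names(stripped[[1]]))  # NULL

# Step 3: rbind() on the nameless data frames; capture whatever happens.
res_unnamed <- tryCatch(
  do.call(rbind, stripped),
  error = function(e) conditionMessage(e)
)
print(res_unnamed)
```

In other words, `unname()` trades one naming problem for another; assigning the same explicit names to every data frame avoids both.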