I have a list of matrices that all have the same number of columns but that have varying number and naming of rows. They look something like this:

$Name1
                  c1 c2 c3 c4 c5 c6
Spec1              0  2  0  1  0  0   
Spec2              1  0  1  0  0  0
Spec3              1  0  1  0  0  0

$Name2
                  c1 c2 c3 c4 c5 c6
Spec1              0  0  0  0  1  0   
Spec4              0  0  0  1  0  0
Spec5              0  0  0  0  0  1

I'm trying to get them all into one dataframe while preserving both the rownames and the names of the matrices. Something like this is what I'm trying to get:

                        c1 c2 c3 c4 c5 c6
Name1Spec1              0  2  0  1  0  0   
Name1Spec2              1  0  1  0  0  0
Name1Spec3              1  0  1  0  0  0
Name2Spec1              0  0  0  0  1  0   
Name2Spec4              0  0  0  1  0  0
Name2Spec5              0  0  0  0  0  1

do.call(rbind,...) gets the data how I want it, but I haven't been able to figure out how to get the names preserved or concatenated like that. I've also tried a few ways to make the name list separately and failed on those fronts. The final dataframe should be 1113 rows, but there are 358 matrices in the list. I've tried many inelegant things, but I figure something like this should be close?

list.names<-list()
for(i in 1:length(ListofMatrices)){
  list.names[i]<-rownames(ListofMatrices[[i]])
}

I feel like I'm missing something plainly obvious with lapply or setting up a loop.

MSIM
  • Related: [Combine (rbind) data frames and create column with name of original data frames](https://stackoverflow.com/questions/15162197/combine-rbind-data-frames-and-create-column-with-name-of-original-data-frames) – Henrik Oct 29 '18 at 21:34
  • BTW, MSIM, in the recent tide of first-time-askers on SO, the majority (that I read, mostly [tag:r]) do not include good sample data, good sample code, or expected output ... if this is truly your first question, thank you for taking the time to frame it so well. (Even if it isn't, and you've been learning ... thanks.) – r2evans Oct 29 '18 at 21:51
  • I've been lurking on here for many years, just hadn't needed to ask a question before! – MSIM Oct 30 '18 at 13:31

2 Answers

3

There shouldn't be a need to use a for loop. If l is your list of frames ...

do.call(rbind, l)
#             c1 c2 c3 c4 c5 c6
# Name1.Spec1  0  2  0  1  0  0
# Name1.Spec2  1  0  1  0  0  0
# Name1.Spec3  1  0  1  0  0  0
# Name2.Spec1  0  2  0  1  0  0
# Name2.Spec4  1  0  1  0  0  0
# Name2.Spec5  1  0  1  0  0  0
# Name2.Spec6  1  0  1  0  0  0

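That is a close match for what you asked for: the only difference is an extra dot in the row names. (The NameN. prefix appears because l holds data frames; rbind on plain matrices keeps the original row names unchanged.) If you really want the dot removed, there are two options: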

rn <- mapply(paste0, rep(names(l), sapply(l, nrow)), unlist(sapply(l, rownames)))
rn
#        Name1        Name1        Name1        Name2        Name2        Name2        Name2 
# "Name1Spec1" "Name1Spec2" "Name1Spec3" "Name2Spec1" "Name2Spec4" "Name2Spec5" "Name2Spec6" 
out <- do.call(rbind, l)
rownames(out) <- rn
out
#            c1 c2 c3 c4 c5 c6
# Name1Spec1  0  2  0  1  0  0
# Name1Spec2  1  0  1  0  0  0
# Name1Spec3  1  0  1  0  0  0
# Name2Spec1  0  2  0  1  0  0
# Name2Spec4  1  0  1  0  0  0
# Name2Spec5  1  0  1  0  0  0
# Name2Spec6  1  0  1  0  0  0

or

out <- do.call(rbind, l)
rownames(out) <- gsub("\\.", "", rownames(out))

(though the latter will be wrong if you naturally have dots in any of the names).
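
As a sketch of the same idea without mapply: paste0 is already vectorized, so the list names and the original row names can be combined in a single call. The list mats below is hypothetical example data mirroring the question (a named list of matrices rather than data frames), and new_names is just an illustrative variable name. With matrices, rbind keeps the original row names, so the prefix has to be added by hand.

```r
# Hypothetical example data: a named list of matrices, as in the question.
mats <- list(
  Name1 = matrix(c(0, 2, 0, 1, 0, 0,
                   1, 0, 1, 0, 0, 0,
                   1, 0, 1, 0, 0, 0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("Spec1", "Spec2", "Spec3"), paste0("c", 1:6))),
  Name2 = matrix(c(0, 0, 0, 0, 1, 0,
                   0, 0, 0, 1, 0, 0,
                   0, 0, 0, 0, 0, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("Spec1", "Spec4", "Spec5"), paste0("c", 1:6)))
)

# Repeat each list name once per row of its matrix, then paste it onto the
# original row names. lapply() always returns a list, so unlist() behaves the
# same whether or not the matrices happen to have equal row counts.
new_names <- paste0(
  rep(names(mats), vapply(mats, nrow, integer(1))),
  unlist(lapply(mats, rownames), use.names = FALSE)
)

# Stack the matrices, set the row names while the result is still a matrix
# (matrices tolerate the duplicated "Spec1"), then convert to a data frame.
out <- do.call(rbind, mats)
rownames(out) <- new_names
out <- as.data.frame(out)
out
```

Setting the row names before calling as.data.frame avoids relying on how duplicated row names are handled during the conversion.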


Data. (I added one row in the second frame to ensure that the new row naming is correct.)

l <- setNames(list(
  read.table(header=TRUE, text='
                  c1 c2 c3 c4 c5 c6
Spec1              0  2  0  1  0  0   
Spec2              1  0  1  0  0  0
Spec3              1  0  1  0  0  0'),
  read.table(header=TRUE, text='
                  c1 c2 c3 c4 c5 c6
Spec1              0  2  0  1  0  0   
Spec4              1  0  1  0  0  0
Spec5              1  0  1  0  0  0
Spec6              1  0  1  0  0  0')
), c("Name1", "Name2"))
r2evans
  • I agree this is the right way to go if the rownames have to be adjusted (and have upvoted), but doesn't just `do.call(rbind, l)` give rownames like `Name1.SpecN` out-of-the-box? – thelatemail Oct 29 '18 at 23:01
  • Yes it does ... I should have included that. I went with the verbatim approach to match the OP, but you're right, that should be included. – r2evans Oct 29 '18 at 23:02
  • The do.call(rbind, l) simply adds a ".N" to the end of rows which have the same rowname from different matrices, and doesn't include the "NameN" part from each matrix. The mapply bit is exactly what I needed though, thanks! – MSIM Oct 30 '18 at 13:29
0

An alternative solution using purrr::map and dplyr (which may or may not be easier/more intuitive than r2evans's solution):

# Load the packages that provide map_dfr(), %>%, mutate() and select():
library(purrr)
library(dplyr)

# Recreate your data (byrow = TRUE so the values fill row by row, as in the question):
test <- list(Name1 = matrix(data = c(0,2,0,1,0,0,1,0,1,0,0,0,1,0,1,0,0,0), 
                        nrow = 3, ncol = 6, byrow = TRUE,
                        dimnames = list(c("Spec1", "Spec2", "Spec3"), 
                                        c("c1", "c2", "c3", "c4", "c5", "c6"))),
         Name2 = matrix(data = c(0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1), 
                        nrow = 3, ncol = 6, byrow = TRUE,
                        dimnames = list(c("Spec1", "Spec4", "Spec5"), 
                                        c("c1", "c2", "c3", "c4", "c5", "c6"))))

df <- map_dfr(1:length(test), ~test[[.x]] %>% 
      as.data.frame() %>% 
      mutate(items = names(test[.x]),
             specs = row.names(test[[.x]]),
             combined_names = paste0(items, specs)) %>% 
      select(9, 1:6))

df
  combined_names c1 c2 c3 c4 c5 c6
1     Name1Spec1  0  2  0  1  0  0
2     Name1Spec2  1  0  1  0  0  0
3     Name1Spec3  1  0  1  0  0  0
4     Name2Spec1  0  0  0  0  1  0
5     Name2Spec4  0  0  0  1  0  0
6     Name2Spec5  0  0  0  0  0  1

This may be a bit easier to parse if we pull out the conversion as its own function:

df_extractor <- function(x) {
  test[[x]] %>% as.data.frame() %>% # Take the data from each matrix and convert it into a data frame
    mutate(items = names(test[x]), # This extracts the name of each list
           specs = row.names(test[[x]]), # This extracts the original row names
           combined_names = paste0(items, specs)) %>% # Concatenate them together in your style above
    select(9, 1:6) # Select and reorder columns.
}

df <- map_dfr(1:length(test), ~df_extractor(.x)) # use map_dfr to bind the resulting data frames together.
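
If keeping the list name and the original row name as separate columns is acceptable, a shorter route may be dplyr::bind_rows with its .id argument, which is the approach discussed in the related question linked in the comments. This is only a sketch: mats is hypothetical example data mirroring the question, and the column names name, spec and combined_names are chosen here for illustration.

```r
library(dplyr)

# Hypothetical example data mirroring the question: a named list of matrices.
mats <- list(
  Name1 = matrix(c(0, 2, 0, 1, 0, 0,
                   1, 0, 1, 0, 0, 0,
                   1, 0, 1, 0, 0, 0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("Spec1", "Spec2", "Spec3"), paste0("c", 1:6))),
  Name2 = matrix(c(0, 0, 0, 0, 1, 0,
                   0, 0, 0, 1, 0, 0,
                   0, 0, 0, 0, 0, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("Spec1", "Spec4", "Spec5"), paste0("c", 1:6)))
)

# Convert each matrix to a data frame, moving its row names into a "spec" column
# so they survive the bind (bind_rows() does not keep row names).
frames <- lapply(mats, function(m) {
  tibble::rownames_to_column(as.data.frame(m), var = "spec")
})

# Stack the frames; .id = "name" records which list element each row came from.
long <- bind_rows(frames, .id = "name")

# Optionally build the concatenated label requested in the question.
long$combined_names <- paste0(long$name, long$spec)
long
```

Keeping name and spec as separate columns makes later filtering or grouping by matrix easier than parsing them back out of a combined string.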
benc