I have a large list of matrices of different sizes. Here are the first few elements, which include matrices of size 1x1 as well as a matrix of size 542x1191:

List of 627
 $ 1  : num [1, 1] 1
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr "94728_61406"
  .. ..$ : chr "6794602"
 $ 2  : num [1:2, 1:2] 1 0 0 1
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:2] "132479_177215" "67496_29758"
  .. ..$ : chr [1:2] "1008667" "8009082"
 $ 3  : num [1, 1] 1
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr "132479_177215"
  .. ..$ : chr "6740421"
 $ 4  : num [1, 1] 1
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr "20825_2765"
  .. ..$ : chr "6777805"
 $ 5  : num [1:542, 1:1191] 0 0 0 0 0 0 0 0 0 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:542] "100090_1753055" "100140_659556" "100173_597660" "100230_231297" ...
  .. ..$ : chr [1:1191] "1001682" "1001990" "1002541" "1002790" ...

I'm trying to take the data from these matrices of different sizes, which can look like this (for simplicity I have renamed the columns):

                       A       B       C       D
12760600_512333        1       1       0       0
132479_177215          0       0       1       0
84069228_2388656       0       0       0       1


                       A       B       C       D      E
12760600_512333        0       1       0       0      1
132479_177215          1       1       1       0      0
84069228_2388656       0       0       1       1      0

and put them into a bigger data.frame, which initially looks like this:

    A   B   C   E   F   D   Q   Z   . . .
1   NA  NA  NA  NA  NA  NA  NA  NA
2   NA  NA  NA  NA  NA  NA  NA  NA
3   NA  NA  NA  NA  NA  NA  NA  NA
4   NA  NA  NA  NA  NA  NA  NA  NA
.
.
.

So each input matrix can have a different set of column names, and the output data frame contains all of these names.

And the output data frame should look like this:

    A   B   C   E   F   D   Q   Z   . . .
1   1   1   0   NA  NA  0   NA  NA
2   0   0   1   NA  NA  0   NA  NA
3   0   0   0   NA  NA  1   NA  NA
4   NA  NA  NA  NA  NA  NA  NA  NA
5   0   1   0   1   NA  0   NA  NA 
6   1   1   1   0   NA  0   NA  NA
7   0   0   1   0   NA  1   NA  NA 
.
.
.

I tried a for loop that identifies matching column names and then puts each value into that column and the appropriate row. It takes a lot of time, because I have many matrices of size 500x1100 and bigger, and the output data.frame has more than 50,000 columns. I want the output to be a data.frame because I don't know in advance how many rows a matrix would need: the input matrices are in a large list of 627 elements, each with a different number of rows, so getting the total number of rows would take another for loop, which I want to avoid.
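For reference, the total row count and the full set of column names can be computed without writing an explicit for loop. A minimal sketch, assuming the list is called `mats` (a hypothetical name, with toy matrices standing in for the real data):

```r
# Hypothetical stand-in for the list of 627 matrices
mats <- list(
  matrix(1, 1, 1, dimnames = list("r1", "A")),
  matrix(c(1, 0, 0, 1), 2, 2, dimnames = list(c("r2", "r3"), c("A", "B")))
)

# Total number of rows across all matrices
total_rows <- sum(vapply(mats, nrow, integer(1)))

# Every column name that occurs in at least one matrix
all_cols <- unique(unlist(lapply(mats, colnames)))
```

With these two values the output could be preallocated as a matrix instead of being grown row by row.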

This is the code I tried (for one matrix from the list only; handling every matrix would need one more for loop):

# One NA row with one column per SKU level
dataframe <- as.data.frame(matrix(ncol = nlevels(data1$SKU)))
colnames(dataframe) <- levels(data1$SKU)

for (k in 1:nrow(matrix)){
  for (i in 1:ncol(matrix)){
    for (j in 1:ncol(dataframe)){
      # Copy the value when the matrix column matches the data frame column
      if (colnames(matrix)[i] == colnames(dataframe)[j]){
        dataframe[k,j] <- matrix[k,i]
      }
    }
  }
}

Note: matrix and dataframe aren't my real variable names; I know that matrix is also a function.

Thanks for help!

2 Answers

This is easy and relatively efficient with package data.table:

L <- list(cbind(b = 10), 
  cbind(a = 1:2, b = 2:3))

library(data.table)
rbindlist(lapply(L, as.data.table), fill = TRUE)
#    b  a
#1: 10 NA
#2:  2  1
#3:  3  2
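
If the row names of the matrices and the index of the source matrix should survive the stacking, the same call can be extended. This is only a sketch on made-up toy matrices; `keep.rownames` is an argument of `as.data.table` for matrices and `idcol` is an argument of `rbindlist`, while the column name "source" is an arbitrary choice:

```r
library(data.table)

# Toy matrices with row and column names (made-up data)
L <- list(
  matrix(c(1, 0), nrow = 2, dimnames = list(c("id1", "id2"), "A")),
  matrix(c(0, 1), nrow = 1, dimnames = list("id3", c("A", "B")))
)

stacked <- rbindlist(
  lapply(L, as.data.table, keep.rownames = TRUE),  # row names go into column "rn"
  fill = TRUE,                                     # missing columns are filled with NA
  idcol = "source"                                 # index of the originating matrix
)
```

Here `stacked` has the columns `source`, `rn`, `A` and `B`, with `NA` in `B` for the rows that come from the first matrix.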
Roland
  • In the same sense, `plyr::rbind.fill(lapply(L, as.data.frame))` – Sotos Aug 04 '17 at 11:43
  • Thanks @Roland, rbindlist works! Now I'm wondering how to replace NA with 0 quickly, because I need two matrices: one with NAs and a second one with zeros instead. I tried `mat <- lapply(mat, function(x){replace(x, is.na(x),0)})` but got an error, and the approach from this [link](https://stackoverflow.com/questions/7235657/fastest-way-to-replace-nas-in-a-large-data-table) also gave an error – Martina Zapletalová Aug 04 '17 at 18:44
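
One way to get both versions asked for in the comment above is to copy the object and overwrite the `NA` entries by logical indexing. A minimal base R sketch on a made-up toy matrix (the names `with_na` and `with_zeros` are hypothetical); the same indexing also works on a data.frame:

```r
# Toy result containing NAs (made-up data)
with_na <- rbind(c(A = 1, B = NA), c(A = NA, B = 0))

# Copy first so the NA version is kept, then overwrite the NAs in the copy
with_zeros <- with_na
with_zeros[is.na(with_zeros)] <- 0
```

For a large data.table, recent versions of the package also provide `setnafill()`, which replaces values by reference and may be faster.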

If you mean a join like an SQL outer join, you should use merge; this is also described in "How to join (merge) data frames (inner, outer, left, right)?".

A quick example:

m1 <- matrix(data = c(1:5,1:5),ncol = 2)
m2 <- matrix(data = c(5:10,10:15),ncol = 2)
merge(m1,m2,all = TRUE)

and the result will be:

   V1 V2
1   1  1
2   2  2
3   3  3
4   4  4
5   5  5
6   5 10
7   6 11
8   7 12
9   8 13
10  9 14
11 10 15
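
Note that merge joins on the shared columns rather than appending rows. If the goal is to stack the rows of many matrices and align the columns by name, a base R sketch along the following lines may be closer to it. The list name `mats` and the toy data are assumptions for illustration:

```r
# Toy matrices with partly overlapping column names (made-up data)
mats <- list(
  matrix(c(1, 0, 1, 0), 2, 2, dimnames = list(c("r1", "r2"), c("A", "B"))),
  matrix(c(0, 1), 1, 2, dimnames = list("r3", c("B", "C")))
)

all_cols <- unique(unlist(lapply(mats, colnames)))
n_rows <- vapply(mats, nrow, integer(1))

# Preallocate the full result once, filled with NA
out <- matrix(NA_real_, nrow = sum(n_rows), ncol = length(all_cols),
              dimnames = list(unlist(lapply(mats, rownames)), all_cols))

# Write each matrix into its block of rows, matching columns by name
offset <- 0L
for (m in mats) {
  rows <- offset + seq_len(nrow(m))
  out[rows, colnames(m)] <- m
  offset <- offset + nrow(m)
}
```

The only loop runs over the list elements; the assignment inside it is vectorised, so no cell-by-cell comparison of column names is needed.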
ImmoXZ