My issue is that, within a loop, every iteration i outputs a matrix like this:

structure(c(8L, 4L, 3L, 4L, 1L, 8L, 28L, 32L, 24L, 32L, 8L, 64L, 
0L, 6L, 12L, 16L, 4L, 32L, 0L, 0L, 3L, 12L, 3L, 24L, 0L, 0L, 
0L, 6L, 4L, 32L, 0L, 0L, 0L, 0L, 0L, 8L, 0L, 0L, 0L, 0L, 0L, 
28L), .Dim = 6:7, .Dimnames = structure(list(c("ESN", "GWD", 
"LWK", "MSL", "PEL", "YRI"), c("ACB", "ESN", "GWD", "LWK", "MSL", 
"PEL", "YRI")), .Names = c("", "")), class = "table")

This matrix counts pairwise sharing. These counts should now be added to a larger table that has more levels than the 7 present in this table. The larger table is always a symmetric matrix, so the upper triangle can be neglected.

The real table (in which all elements are 0 at the beginning):

matr<-matrix(0,nrow=26,ncol=26)
pop<-c("CHB","JPT","CHS","CDX","KHV","CEU","TSI","FIN","GBR","IBS","YRI","LWK","GWD","MSL","ESN","ASW","ACB","MXL","PUR","CLM","PEL","GIH","PJL","BEB","STU","ITU")

rownames(matr)<-pop
colnames(matr)<-pop

Can somebody tell me how I can add the counts from the small table to the large table (in the correct cells) in an efficient way? I need to update the table 100k times, so efficiency matters. As mentioned, adding only in the lower triangle is fine.

EDIT

Another data set, generated by the next iteration of the loop, might look like this:

structure(c(1L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 1L), .Dim = c(3L, 
3L), .Dimnames = structure(list(c("IBS", "MXL", "TSI"), c("GBR", 
"IBS", "MXL")), .Names = c("", "")), class = "table")

This should then also be added to matr. If a cell already holds a number, the two numbers should be added together.

Thanks

kutyw
  • You'll need [matrix-indexing](http://stackoverflow.com/questions/6920441/index-values-from-a-matrix-using-row-col-indicies) while assigning; i.e. in each step (having created a table "tab") you'll want to update the `cbind(rownames(tab)[row(tab)], colnames(tab)[col(tab)])` indices of "matr" by adding "tab". – alexis_laz Jun 01 '16 at 09:49
  • @alexis_laz - but how do I add only to the lower triangle? If a count would land above the diagonal, it has to be moved below the diagonal. – kutyw Jun 01 '16 at 09:51
  • Unless I'm missing something, it seems that -after updating with all "tab"s-, something like `matr + t(matr)` will update the lower triangle? And, then, add 0s to `upper.tri` – alexis_laz Jun 01 '16 at 09:56
  • @alexis_laz apologies - I'm not sure I understand. Would you mind laying it out in a bit more detail, please, as an answer that I could accept? Why would you t() it? – kutyw Jun 01 '16 at 10:01
  • Both of the "table"s in your example are not symmetric and, also, have duplicate entries with different values -- e.g.in the first table `["ESN", "GWD"] != ["GWD", "ESN"]` and in second `["IBS", "MXL"] != ["MXL", "IBS"]`. In the duplicate entries, will there -always- be one `== 0` and one `!= 0` or could both be `!= 0` and in "matr" their sum should be added? And are your "table"s, indeed, non-symmetric? – alexis_laz Jun 01 '16 at 10:41
  • @alexis_laz m1 is not symmetric, yes; matr is symmetric. The duplicate entries with different values need to be added together, and both could be different from 0. Yes, the sum should be added in matr, but under the diagonal, so the sum of both should be entered in the lower triangle of the matrix. – kutyw Jun 01 '16 at 10:44
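
The `matr + t(matr)` idea from the comments can be sketched as follows. This is only an illustration on toy data: the names `lv`, `acc`, `tab` and `folded` are made up here, and the update uses name-based block indexing rather than the `cbind(...)` matrix index suggested above. Each table is added wherever its row and column names point, and the upper triangle is folded into the lower one once, after all updates.

```r
# Toy accumulator with three levels (stands in for the 26 x 26 "matr").
lv  <- c("a", "b", "c")
acc <- matrix(0, 3, 3, dimnames = list(lv, lv))

# One toy "table" with rows a, c and columns a, b (hypothetical data).
tab <- as.table(matrix(c(1, 2, 3, 4), 2, 2,
                       dimnames = list(c("a", "c"), c("a", "b"))))

# Add the table at the cells named by its dimnames, above or below the diagonal.
r <- rownames(tab)
k <- colnames(tab)
acc[r, k] <- acc[r, k] + unclass(tab)

# After all updates: fold the upper triangle into the lower one.
folded <- acc + t(acc)
diag(folded) <- diag(acc)            # the diagonal must not be counted twice
folded[upper.tri(folded)] <- 0
folded
```

Because the fold happens only once, the per-iteration work is a single block addition, which may matter with 100k updates.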

1 Answer


Taking into account duplicate, unequal and non-zero entries in each "table" created through the iterations, and updating only the lower.tri of "matr" (`sparseMatrix` comes from the Matrix package):

library(Matrix)  # provides sparseMatrix()

for(tab in tabs) {
     ## map each cell of 'tab' to its row/column position in 'mat';
     ## if each 'tab' is large enough, a 'rep(, each = )' could be used
     ## instead of creating (and subsetting with) 'row(tab)' and 'col(tab)'
     i = match(rownames(tab), rownames(mat))[row(tab)]
     j = match(colnames(tab), colnames(mat))[col(tab)]

     ## to fill only the 'lower.tri': larger index as row, smaller as column
     ii = pmax(i, j); jj = pmin(i, j)

     ## sum duplicate entries of 'tab' with 'sparseMatrix's intrinsic 'xtabs'-like behaviour
     ijx = summary(sparseMatrix(ii, jj, x = c(tab)))

     ## subset and assign with a matrix index, updating previous entries
     ij = cbind(ijx$i, ijx$j)
     mat[ij] = mat[ij] + ijx$x
}
mat
#  a  b c d e
#a 0  0 0 0 0
#b 4  1 0 0 0
#c 6  7 2 0 0
#d 5 12 5 7 0
#e 4  6 3 3 0

where "tabs" is a "list" containing the -iteratively- created "table"s:

set.seed(007)            
tabs = replicate(3, table(replicate(2, 
                                    sample(letters[1:5], 50, TRUE), simplify = FALSE))[
                                        sample(5, sample(2:5, 1)), sample(5, sample(2:5, 1))], 
                 simplify = FALSE)

and "mat" is a smaller "matr":

mat = matrix(0L, 5, 5, dimnames = replicate(2, letters[1:5], simplify = FALSE))
alexis_laz
  • but what is mat - how is it from mat? – kutyw Jun 01 '16 at 12:32
  • @kutyw : I used the above "mat" and "tabs" as a more simple/extensive example; replacing "mat" with your "matr" and "tabs" with a "list" of the two "table"s `dput` included in your question and running the above, should result in the wanted/updated "matr" – alexis_laz Jun 01 '16 at 12:39
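
As a rough sketch of what that last comment describes, the two `dput` tables from the question can be added to "matr" as shown below. This is a base R variant, not the answer's code: the helper name `add_lower` is hypothetical, and `rowsum()` on a linear index is swapped in for `sparseMatrix` to sum the duplicate cells.

```r
pop <- c("CHB","JPT","CHS","CDX","KHV","CEU","TSI","FIN","GBR","IBS","YRI","LWK","GWD",
         "MSL","ESN","ASW","ACB","MXL","PUR","CLM","PEL","GIH","PJL","BEB","STU","ITU")
matr <- matrix(0, nrow = 26, ncol = 26, dimnames = list(pop, pop))

# The two tables from the question.
tab1 <- structure(c(8L, 4L, 3L, 4L, 1L, 8L, 28L, 32L, 24L, 32L, 8L, 64L,
  0L, 6L, 12L, 16L, 4L, 32L, 0L, 0L, 3L, 12L, 3L, 24L, 0L, 0L,
  0L, 6L, 4L, 32L, 0L, 0L, 0L, 0L, 0L, 8L, 0L, 0L, 0L, 0L, 0L,
  28L), .Dim = 6:7, .Dimnames = structure(list(c("ESN", "GWD",
  "LWK", "MSL", "PEL", "YRI"), c("ACB", "ESN", "GWD", "LWK", "MSL",
  "PEL", "YRI")), .Names = c("", "")), class = "table")
tab2 <- structure(c(1L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 1L), .Dim = c(3L,
  3L), .Dimnames = structure(list(c("IBS", "MXL", "TSI"), c("GBR",
  "IBS", "MXL")), .Names = c("", "")), class = "table")

# Hypothetical helper: add one table to the lower triangle of 'm'.
add_lower <- function(m, tab) {
  i <- match(rownames(tab), rownames(m))[row(tab)]
  j <- match(colnames(tab), colnames(m))[col(tab)]
  # linear (column-major) index of the lower-triangle cell for each entry
  lin <- pmax(i, j) + (pmin(i, j) - 1L) * nrow(m)
  # sum entries that map to the same cell, e.g. ["ESN","GWD"] and ["GWD","ESN"]
  s   <- rowsum(as.vector(tab), lin)
  idx <- as.integer(rownames(s))
  m[idx] <- m[idx] + s[, 1]
  m
}

for (tab in list(tab1, tab2)) matr <- add_lower(matr, tab)
matr["ESN", "GWD"]   # 0 from ["ESN","GWD"] plus 32 from ["GWD","ESN"]
```

The helper assumes every row and column name of a table occurs in `pop`; an unmatched name would give `NA` indices.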