How to compute the total number of combinations between three datasets?

Question

I have three datasets and would like to know how many observations (N) were used in my calculations.

I read the data into a multi-dimensional array with dimensions (nx, ny, nsteps, ndatasets), e.g. with a smaller example dataset:

    #             nx   ny   nsteps ndatasets
    dat = runif(20 * 30 * 100 *  3)
    dim(dat) = c(20, 30, 100, 3)
    > str(dat)
    num [1:20, 1:30, 1:100, 1:3] 0.1834 0.8537 0.0672 0.0734 0.8078 ...

We take advantage of `cor.test` and build this function to compute how many observations (N) we have:

    cor_withN <- function(...) {
        res <- try(cor.test(...)$parameter+2, silent=TRUE)
        ifelse(class(res)=="try-error", NA, res)
    }

Now we take advantage of the fact that apply also works on multi-dimensional arrays, not only matrices, and use it to iterate over all the x,y,z triples:

    cor_result = apply(dat, c(1,2), function(x) cor_withN(x[,1], x[,2], x[,3]))
    > str(cor_result)
    logi [1:20, 1:30] NA NA NA NA NA NA ...

Something is wrong, since I get NA everywhere. If the last line had worked, then

    str(cor_result)

should be

    num [1:20, 1:30] 100 100 100 100 100 ... (nsteps)

Any idea on why I am getting NA or is there another way to do it?

When I tested it with 2 datasets, it went well!

    cor_result = apply(dat, c(1,2), function(x) cor_withN(x[,1], x[,2]))
    > str(cor_result)
    num [1:20, 1:30] 100 100 100 100 100 100 100 100 100 100

So the problem appears when I add x[,3]! Thanks.

I'm sorry, but `cor.test`, as far as I know, takes two vectors, and you are passing it three. How is that supposed to work? — January, Jul 15 '13 at 17:37
I am trying, but I am not sure if we can make it work. Is there any other way to compute the N of triples, like `cor.test` but for three arguments? — sacvf, Jul 15 '13 at 17:38
I'm not sure what you mean by "N of triples". Total number of combinations? There are simpler ways than calculating the correlation coefficient for that. — January, Jul 15 '13 at 17:48
Yes, I meant the total number of combinations (the length of the vector). — sacvf, Jul 15 '13 at 17:52
I'm not entirely sure what you are trying to achieve here. `cor.test` is a function to test the significance of the correlation coefficient between two vectors of the same length. The `parameter` in the result is the number of degrees of freedom which is always equal to the length of the vector minus 2. How is that supposed to have anything to do with the number of combinations? (what combinations?) Please give some examples along with what you expect the function to return. — January, Jul 15 '13 at 17:52
What are the other ways to do this (get the length of the vector)? — sacvf, Jul 15 '13 at 17:53
I do not fully understand what the OP is asking for, but going by the thread title, the function `expand.grid` may be of use to him. Example: `expand.grid(c(1,2,3),c(4,5,6),c(7,8,9))`. — cryo111, Jul 15 '13 at 18:11
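
A minimal sketch of the point made in the comments above, assuming three made-up vectors `x`, `y` and `z` (not the asker's data). It illustrates that `cor.test(x, y)$parameter + 2` is simply the vector length, and that a third positional vector ends up in the `alternative` argument of `cor.test`, so the call fails and a `try` wrapper like `cor_withN` falls back to NA.

```r
set.seed(1)
# Hypothetical example vectors, each of length 100
x <- runif(100)
y <- runif(100)
z <- runif(100)

# Degrees of freedom of the Pearson test are n - 2, so df + 2 gives n back
n_two <- unname(cor.test(x, y)$parameter) + 2

# A third vector is matched to `alternative`, which must be a character
# string, so the call raises an error instead of testing three vectors
three <- try(cor.test(x, y, z), silent = TRUE)
failed <- inherits(three, "try-error")

print(n_two)
print(failed)
```

For complete vectors the same number is available directly from `length(x)`, without running a test at all.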

score 2 · Accepted Answer · edited May 23 '17 at 12:11


Using this, you can do the following, for example:

    corpij <- function(i, j, data) {
        # N = degrees of freedom + 2; NA if cor.test fails
        tryCatch(cor.test(data[, i], data[, j])$parameter + 2,
                 error = function(e) NA)
    }

    # vectorize over the column indices so that outer() can call it
    corp <- Vectorize(corpij, vectorize.args = c("i", "j"))
    result = apply(dat, c(1,2),
                   function(x) outer(1:ncol(x), 1:ncol(x), corp, data = x))

`outer` evaluates the function for every pair of columns, including each column paired with itself.
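
A rough sketch of what this approach returns, run on a smaller made-up array (4 x 5 grid, 10 time steps, 3 datasets) so it finishes quickly. The array sizes and the name `pairs_n` are assumptions for illustration only. Because `apply` flattens each 3 x 3 matrix from `outer` into a vector of length 9, the first dimension of `result` is 9; it can be reshaped back so that the first two indices are the column pair.

```r
set.seed(1)
# Hypothetical small array: nx = 4, ny = 5, nsteps = 10, ndatasets = 3
dat <- array(runif(4 * 5 * 10 * 3), dim = c(4, 5, 10, 3))

corpij <- function(i, j, data) {
  # N = degrees of freedom + 2; NA if cor.test fails
  tryCatch(unname(cor.test(data[, i], data[, j])$parameter) + 2,
           error = function(e) NA)
}
corp <- Vectorize(corpij, vectorize.args = c("i", "j"))

result <- apply(dat, c(1, 2), function(x) {
  outer(seq_len(ncol(x)), seq_len(ncol(x)), corp, data = x)
})

# Each grid cell contributes a flattened 3 x 3 matrix, hence 9 x 4 x 5
print(dim(result))

# Restore the pair structure: pairs_n[i, j, ix, iy] is N for columns i and j
pairs_n <- array(result, dim = c(3, 3, dim(result)[-1]))
print(pairs_n[1, 2, 1, 1])
```

With no missing values every entry equals the number of time steps, which is why the output in the comments below is a run of 100s.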

answered Jul 15 '13 at 17:55

agstudy

Thanks, that worked well, but the `dim` of the result was not correct. Look at this: `str(result)` `num [1:9, 1:20, 1:30] 100 100 100 100 100 100 100 100 100 100 ...` – sacvf Jul 15 '13 at 18:05
The `dim` of the result was not correct. Look at this: `str(result)` `num [1:9, 1:20, 1:30] 100 100 100 100 100 100 100 100 100 100` – sacvf Jul 15 '13 at 18:14
Where did the dim `[1:9` come from? – sacvf Jul 15 '13 at 18:14
You give `outer` data=x, where x is a matrix (100x3), and you compute the correlation for all combinations of columns. You have 3 columns, so 3^2 = 9 (see `expand.grid(1:3,1:3)`). You even compute the correlation of each column with itself (`cor(x[,1],x[,1])`). Hope it is clear. – agstudy Jul 15 '13 at 18:29
It is a matrix of `(20*30)` not `(100x3)` – sacvf Jul 15 '13 at 18:32
@sacvf NO! `dim(dat) = c(20, 30, 100, 3)` and you pick `c(1,2)` in your `apply`, so the remaining dimensions are 3 and 4, i.e. `100x3`. In any case, **3**, the number of columns, is the dimension that gives the 9. – agstudy Jul 15 '13 at 18:36
let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/33504/discussion-between-agstudy-and-sacvf) – agstudy Jul 15 '13 at 18:38
I got this error: `Error in cor.test.default(data[, i], data[, j]) : not enough finite observations`. Could you kindly add something to the function to take care of this problem? – sacvf Jul 15 '13 at 18:39

Good grief. +1 for patience! – Simon O'Hanlon Jul 15 '13 at 20:37
In fact, the `dim` of the result should be the same as the `dim` of the original data. `dat` has `20 rows and 30 columns`, but `result` has `9 rows and 20 columns`. What I meant is to keep the `dim` of `dat` as it is and just replace its values with the total number of combinations. Just type `dat`, press enter and look at the output, then do the same for `result` and you will see what I mean! – sacvf Jul 16 '13 at 12:33
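
If the goal is only a 20 x 30 array holding the number of usable time steps per grid cell, one possible sketch skips `cor.test` altogether and counts the time steps at which all three datasets are non-missing, using `complete.cases`. This is an alternative to the accepted answer's approach, not a restatement of it; the injected NA values and the name `n_complete` are purely illustrative.

```r
set.seed(42)
# Same layout as in the question: nx = 20, ny = 30, nsteps = 100, ndatasets = 3
dat <- array(runif(20 * 30 * 100 * 3), dim = c(20, 30, 100, 3))

# Hypothetical gaps: dataset 2 is missing its first 5 time steps at cell (1, 1)
dat[1, 1, 1:5, 2] <- NA

# For each (x, y) cell, x is a 100 x 3 matrix (time steps by datasets);
# count the rows in which all three datasets have a value
n_complete <- apply(dat, c(1, 2), function(x) sum(complete.cases(x)))

print(dim(n_complete))
print(n_complete[1, 1])
print(n_complete[2, 1])
```

The result keeps the 20 x 30 grid of `dat`, with one count per cell instead of one value per dataset pair.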
