I have two lists (a and b), each containing 626257 vectors, and each vector contains 44 numeric entries. One list contains sample data and the other serves as a reference. I want to calculate the Pearson correlation between all the entries of both lists and store the values in a variable (r).

Unfortunately, "r" contains only "NA" entries.

Here is the code to generate two dummy lists.

a = replicate(626257,rep(10,44),simplify = FALSE)
b = replicate(626257,rep(3,44),simplify = FALSE)

And here is the code to calculate the correlation.

r = lapply(seq_along(a), function(ind)cor(a[[ind]], b[[ind]]))
View(r)
stefx
  • Your vectors will have zero variance, try replacing them with random data, `rnorm(44)` for example. – AkselA Aug 04 '19 at 18:20
  • What is your desired result? Do you want 626257 values or do you want one value (the correlation between all values in `a` and all values in `b`)? – Brigadeiro Aug 04 '19 at 18:24
  • I cannot replace the 44 with rnorm(44), because I want to have 44 entries per vector. I tried rnorm(10) and rnorm(3) instead, but now I have 440 and 132 entries per vector, and I do not understand why. I want random numbers within the vectors, but the number (626257) and length (44) of the vectors should not change. In the end I would like to have 626257 correlation values. – stefx Aug 04 '19 at 18:33
  • The code in my answer (which includes `rnorm(44)`) creates a vector with 44 normally distributed values - it is what you want. – Brigadeiro Aug 04 '19 at 18:38
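
The two points raised in these comments can be checked on single vectors. The following is a minimal sketch, not the asker's actual code: the names `x_const`, `y_const` and `r_const` are made up for illustration, and the `rep(rnorm(10), 44)` call is only an assumed reconstruction of the attempt that produced 440 entries.

```r
# Constant vectors, as in the question's dummy data
x_const <- rep(10, 44)
y_const <- rep(3, 44)

sd(x_const)  # the standard deviation of a constant vector is 0

# cor() divides by the standard deviations, so it returns NA (with a warning)
r_const <- suppressWarnings(cor(x_const, y_const))
is.na(r_const)

# Assumed reconstruction of the attempt described in the comment:
# rep() repeats the whole 10-element vector 44 times
length(rep(rnorm(10), 44))  # 440 entries
length(rnorm(44))           # 44 entries, which is the length that was wanted
```

In `rnorm(44)` the argument is the number of values to draw, so it takes the place of the whole `rep(10, 44)` call rather than just the `10`.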

1 Answer

You can use `mapply` to complete this task pretty easily. As AkselA pointed out, you first need to simulate data with some variance (using `rnorm(44)` instead of `rep(10, 44)`, for example). See below:

a <- replicate(626257,rnorm(44),simplify = FALSE)
b <- replicate(626257,rnorm(44),simplify = FALSE)
r <- mapply(cor, a, b)
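
To sanity-check the result without waiting on all 626257 pairs, the same approach can be tried on a smaller sample. This is only a sketch: `n`, `a_small`, `b_small`, `r_mapply` and `r_vapply` are illustrative names, the seed is arbitrary, and the `vapply` variant is shown as one possible alternative rather than a required change.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
n <- 1000     # reduced from 626257 to keep the example quick

a_small <- replicate(n, rnorm(44), simplify = FALSE)
b_small <- replicate(n, rnorm(44), simplify = FALSE)

# mapply() walks both lists in parallel and simplifies to a numeric vector
r_mapply <- mapply(cor, a_small, b_small)

# Index-based version that states the expected return type explicitly
r_vapply <- vapply(seq_along(a_small),
                   function(i) cor(a_small[[i]], b_small[[i]]),
                   numeric(1))

length(r_mapply)               # one correlation per pair of vectors
all.equal(r_mapply, r_vapply)  # both approaches should agree
anyNA(r_mapply)                # random data has non-zero variance, so no NA
```

Unlike the `lapply` call in the question, both versions return a plain numeric vector instead of a list, which is usually easier to summarise or plot.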
Brigadeiro