
I have two datasets collected from the same people, and I want to compute, for each person, the correlation between their responses in the two datasets.

Example datasets:

dat1 <- read.table(header=TRUE, text="
ItemX1 ItemX2 ItemX3 ItemX4 ItemX5 
5 1 2 1 5 
3 1 3 3 4 
2 1 3 1 3  
4 2 5 5 3 
5 1 4 1 2 
")

dat2 <- read.table(header=TRUE, text="
ItemY1 ItemY2 ItemY3 ItemY4 ItemY5 
4 2 1 1 4 
4 3 1 2 5 
1 5 3 2 2  
5 2 4 4 1 
5 1 5 2 1 
")

Does anybody know how to compute the correlation row-wise, i.e. between row i of dat1 and row i of dat2 for each person, and NOT across the two whole datasets?

Thank you!

natash

2 Answers


One possible solution uses {purrr} to iterate over the row indices of both data frames and compute the correlation between each row of dat1 and the corresponding row of dat2.

library(purrr)

dat1 <- read.table(header=TRUE, text="
ItemX1 ItemX2 ItemX3 ItemX4 ItemX5 
5 1 2 1 5 
3 1 3 3 4 
2 1 3 1 3  
4 2 5 5 3 
5 1 4 1 2 
")

dat2 <- read.table(header=TRUE, text="
ItemY1 ItemY2 ItemY3 ItemY4 ItemY5 
4 2 1 1 4 
4 3 1 2 5 
1 5 3 2 2  
5 2 4 4 1 
5 1 5 2 1 
")

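n_person <- nrow(dat1)

# For each row index, correlate that person's row in dat1 with the same row in dat2.
# t() turns each one-row data frame into a one-column matrix, so cor() returns a 1 x 1 matrix.
cormat <- purrr::map_df(.x = setNames(1:n_person, paste0("person_", 1:n_person)), .f = ~cor(t(dat1[.x,]), t(dat2[.x,])))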
cormat
#> # A tibble: 1 x 5
#>   person_1[,"1"] person_2[,"2"] person_3[,"3"] person_4[,"4"] person_5[,"5"]
#>            <dbl>          <dbl>          <dbl>          <dbl>          <dbl>
#> 1          0.917          0.289         -0.330          0.723          0.913

Created on 2020-11-16 by the reprex package (v0.3.0)
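
The odd column names in the tibble above (person_1[,"1"] and so on) arise because cor() is given two one-column matrices and so returns a 1 x 1 matrix. As a minimal sketch of a variant, not part of the original answer, map_dbl() can return a plain named numeric vector instead. The names ids and cor_vec are illustrative, and the sketch assumes both data frames have the same number of rows in the same person order.

```r
library(purrr)

dat1 <- read.table(header = TRUE, text = "
ItemX1 ItemX2 ItemX3 ItemX4 ItemX5
5 1 2 1 5
3 1 3 3 4
2 1 3 1 3
4 2 5 5 3
5 1 4 1 2
")

dat2 <- read.table(header = TRUE, text = "
ItemY1 ItemY2 ItemY3 ItemY4 ItemY5
4 2 1 1 4
4 3 1 2 5
1 5 3 2 2
5 2 4 4 1
5 1 5 2 1
")

n_person <- nrow(dat1)

# Named row indices: the names carry through to the result of map_dbl().
ids <- setNames(seq_len(n_person), paste0("person_", seq_len(n_person)))

# unlist() turns each one-row data frame into a plain numeric vector,
# so cor() returns a single number rather than a 1 x 1 matrix.
cor_vec <- map_dbl(ids, ~ cor(unlist(dat1[.x, ]), unlist(dat2[.x, ])))
round(cor_vec, 3)
```

The result is a named vector with one correlation per person, which is usually easier to attach back to the data (for example as a new column) than a one-row tibble.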

Valeri Voev

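Following the post mentioned by @Ravi, we can transpose each data frame so that every person becomes a column, and then calculate the correlations column by column. One additional step, if you want a less wasteful approach, is to vectorise the cor function over its x and y arguments so that it is applied to each pair of matching columns in turn. Consider something like this: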

tp <- function(x) unname(as.data.frame(t(x)))
Vectorize(cor, c("x", "y"))(tp(dat1), tp(dat2))

Output

[1]  0.9169725  0.2886751 -0.3296902  0.7234780  0.9132660
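
For larger data, the same row-wise Pearson correlations can also be computed without calling cor() once per person, by centring each row and applying the correlation formula with rowSums(). This is a minimal sketch, not part of the original answer; row_cor is a hypothetical helper name, and it assumes both data frames are fully numeric, have identical dimensions, and contain no missing values.

```r
dat1 <- read.table(header = TRUE, text = "
ItemX1 ItemX2 ItemX3 ItemX4 ItemX5
5 1 2 1 5
3 1 3 3 4
2 1 3 1 3
4 2 5 5 3
5 1 4 1 2
")

dat2 <- read.table(header = TRUE, text = "
ItemY1 ItemY2 ItemY3 ItemY4 ItemY5
4 2 1 1 4
4 3 1 2 5
1 5 3 2 2
5 2 4 4 1
5 1 5 2 1
")

# Hypothetical helper: Pearson correlation between matching rows of a and b.
row_cor <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  stopifnot(identical(dim(a), dim(b)))
  # Subtracting rowMeans() recycles down the columns, so each row is centred
  # on its own mean.
  ac <- a - rowMeans(a)
  bc <- b - rowMeans(b)
  # Sum of cross-products divided by the product of the root sums of squares.
  rowSums(ac * bc) / sqrt(rowSums(ac^2) * rowSums(bc^2))
}

res <- row_cor(dat1, dat2)
round(res, 3)
```

The values should agree with the output shown above. A row with zero variance gives NaN here, whereas cor() returns NA with a warning.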
ekoam