
I have a series of data frames, each with an individual identifier (in this example a letter A-E), and the site number it was observed at.

In this example, I have 3 data frames:

Letters<-c("A","B","C","D","E")

Site1<-c(1,1,2,2,2)
Site2<-c(10,10,20,30,30)
Site3<-c(17,27,37,47,57)

Df1<-data.frame(Letters, Site1)
Df2<-data.frame(Letters, Site2)
Df3<-data.frame(Letters, Site3)

For the first one, it ends up looking like this:

Df1
  Letters Site1
1       A     1
2       B     1
3       C     2
4       D     2
5       E     2

Individuals A and B were found at site 1, and individuals C, D, and E were found at site 2.

I'm looking for a way to track which individuals are found at the same site within a single data frame (note that the site numbers change each time, so I only care about within-data-frame groupings).

I'm assuming I would create a separate co-occurrence matrix for each data frame, with each matrix holding only a 1 or a 0 to indicate whether two individuals were found at the same site. Then the last step would be just to add them up, like so:

DF1 co-occurrence

   A B C D E
A  1 1 0 0 0
B  1 1 0 0 0
C  0 0 1 1 1
D  0 0 1 1 1
E  0 0 1 1 1 

DF2 co-occurrence

   A B C D E
A  1 1 0 0 0
B  1 1 0 0 0
C  0 0 1 0 0
D  0 0 0 1 1
E  0 0 0 1 1 

DF3 co-occurrence

   A B C D E
A  1 0 0 0 0
B  0 1 0 0 0
C  0 0 1 0 0
D  0 0 0 1 0
E  0 0 0 0 1 

And then add them up to see who is most often grouped with whom:

   A B C D E
A  3 2 0 0 0
B  2 3 0 0 0
C  0 0 3 1 1
D  0 0 1 3 2
E  0 0 1 2 3 

I'm not sure how to implement this kind of workflow in R, or whether this is even the best way to approach the problem. My hope is to end up with a matrix like the last one above, or with some similar way to quantify total co-occurrence.

Vint
    Bind your data frames together and calculate the crossproduct of the `Letters` x `Site` table. Simplified to illustrate: `tcrossprod(table(rep(Letters, 3), c(Site1, Site2, Site3)))`. – Ritchie Sacramento Nov 15 '22 at 06:15
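
The suggestion in the comment above can be sketched as follows. This is only a minimal sketch under some assumptions: `cooccur` and `ids` are hypothetical names introduced here, and every data frame is assumed to hold the identifier in its first column and the site in its second. It builds one matrix per data frame and then sums them, so a site number that happens to repeat across data frames is not treated as the same site.

```r
Letters <- c("A", "B", "C", "D", "E")
Df1 <- data.frame(Letters, Site1 = c(1, 1, 2, 2, 2))
Df2 <- data.frame(Letters, Site2 = c(10, 10, 20, 30, 30))
Df3 <- data.frame(Letters, Site3 = c(17, 27, 37, 47, 57))

# Hypothetical: the full set of identifiers, so that every matrix has
# the same rows and columns even if an individual is missing somewhere.
ids <- sort(unique(Letters))

# Hypothetical helper: co-occurrence matrix for a single data frame.
# Assumes column 1 is the identifier and column 2 is the site.
cooccur <- function(df) {
  # individuals x sites incidence table (1 = observed at that site)
  incidence <- table(factor(df[[1]], levels = ids), df[[2]])
  # individuals x individuals: number of sites shared by each pair
  tcrossprod(incidence)
}

# Sum the per-data-frame matrices element by element.
total <- Reduce(`+`, lapply(list(Df1, Df2, Df3), cooccur))
total
```

With the example data, `total` should match the summed matrix in the question. The one-liner from the comment pools all sites into a single table, which gives the same result here only because no site number appears in more than one data frame.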
