I have a matrix of pairwise comparisons of all plots in my dataset. Each cell holds the number of species shared by the two plots.

           Plot4 Plot5 Plot6 Plot7 Plot8 Plot9 Plot10
    Plot4     NA    NA    NA    NA    NA    NA     NA
    Plot5      0    NA    NA    NA    NA    NA     NA
    Plot6      1     0    NA    NA    NA    NA     NA
    Plot7      0     0     0    NA    NA    NA     NA
    Plot8      0     1     0     0    NA    NA     NA
    Plot9      0     1     0     0     2    NA     NA
    Plot10     0     0     0     0     1     1     NA

This matrix came from the following dataframe:

    data <-

    region  plot  species
         1   104  A_B
         1   105  B_C
         1   106  A_B
         1   107  C_D
         2   108  B_C
         2   108  E_F
         2   109  B_C
         2   109  E_F
         2   110  E_F

These plots are associated with certain regions. I generated the following loop that creates this pairwise comparison matrix for all 500 plots:

    plots <- unique(data$plot)
    plot.num <- length(plots)
    output <- matrix(0, plot.num, plot.num)
    for (i in 1:plot.num) {
        for (j in 1:plot.num) {
            plot_i <- data[data$plot == plots[i], ]
            plot_j <- data[data$plot == plots[j], ]
            output[i, j] <- length(intersect(plot_i$species, plot_j$species))
        }
    }

    F.mat <- output
    F.mat[lower.tri(F.mat, diag = TRUE)] <- 0

However, now I want to create a loop that subsets the larger matrix above by region to make a list of regional matrices.

output<-

    [[1]]
          Plot4 Plot5 Plot6 Plot7
    Plot4    NA    NA    NA    NA
    Plot5     0    NA    NA    NA
    Plot6     1     0    NA    NA
    Plot7     0     0     0    NA

    [[2]]
           Plot8 Plot9 Plot10
    Plot8     NA    NA     NA
    Plot9      2    NA     NA
    Plot10     1     1     NA

NOTE: This is a quantitative matrix (counts of shared species), not presence/absence.

Danielle
  • Use dput() on your data.frame and paste the resulting structure(...) output into the question. Perhaps edit the post to start with the data, then the matrix, then the loop and its resulting matrix, then the desired list. Starting with dput() output generally makes it easier for others to play along. I am not certain, but it looks like the answer, as usual, is 42: [link](https://stackoverflow.com/questions/17367277/how-to-extract-intragroup-and-intergroup-distances-from-a-distance-matrix-in-r) – Chris May 31 '17 at 03:01
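
As a sketch of what that suggestion looks like in practice: `dput()` prints an R expression that rebuilds the object, so readers can paste it straight into their own session. The data frame below is retyped from the example rows in the question; `rebuilt` is a hypothetical name used only to show the round trip.

```r
# Example rows retyped from the question
data <- data.frame(
  region  = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  plot    = c(104, 105, 106, 107, 108, 108, 109, 109, 110),
  species = c("A_B", "B_C", "A_B", "C_D", "B_C", "E_F", "B_C", "E_F", "E_F")
)

# Prints a structure(list(...)) call that can be pasted into a post
dput(data)

# Round trip: evaluating the printed text recreates the data frame
rebuilt <- eval(parse(text = paste(capture.output(dput(data)), collapse = "\n")))
```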

1 Answer

You could put your evaluation into a function and then lapply over the regions:

countFun <- function(relData){
    plots <- unique(relData$plot)
    plot.num <- length(plots)
    output <- matrix(NA, plot.num, plot.num) 

    if (plot.num > 1){
        for (i in 2:plot.num)  {
            for (j in 1:(i-1))  {
                plot_i <- relData[relData$plot==plots[i],]
                plot_j <- relData[relData$plot==plots[j],]
                output[i,j] <- length(intersect(plot_i$species, plot_j$species))
            }
        }
    }
    output
}

lapply(unique(data$region), function(region) countFun(data[data$region == region,]))

# [[1]]
#      [,1] [,2] [,3] [,4]
# [1,]   NA   NA   NA   NA
# [2,]    0   NA   NA   NA
# [3,]    1    0   NA   NA
# [4,]    0    0    0   NA
# 
# [[2]]
#      [,1] [,2] [,3]
# [1,]   NA   NA   NA
# [2,]    2   NA   NA
# [3,]    1    1   NA
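
The matrices above do not carry the plot labels. As a sketch of an alternative, assuming the same three columns (`region`, `plot`, `species`), the shared-species counts can also be obtained without explicit loops by taking the cross-product of a species-by-plot presence table within each region. The names `sharedByRegion`, `pa` and `res` are hypothetical, introduced only for this illustration.

```r
# Example rows retyped from the question
data <- data.frame(
  region  = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  plot    = c(104, 105, 106, 107, 108, 108, 109, 109, 110),
  species = c("A_B", "B_C", "A_B", "C_D", "B_C", "E_F", "B_C", "E_F", "E_F")
)

# Hypothetical helper: one labelled lower-triangular matrix per region
sharedByRegion <- function(d) {
  lapply(split(d, d$region), function(relData) {
    # species x plot presence matrix (1 if the species occurs in the plot)
    pa <- (table(relData$species, relData$plot) > 0) * 1
    # t(pa) %*% pa counts the species present in both plots of each pair
    shared <- crossprod(pa)
    # keep only the strict lower triangle, as in the desired output
    shared[upper.tri(shared, diag = TRUE)] <- NA
    shared
  })
}

res <- sharedByRegion(data)
res
```

The list elements are named by region, and rows and columns are named by plot, so a pair can be looked up as `res[["2"]]["109", "108"]`. A region with a single plot yields a 1 x 1 `NA` matrix rather than an error.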
ikop
  • Thank you for your time. I am getting an error: ``Error in `[<-`(`*tmp*`, i, j, value = 0L) : subscript out of bounds``. Would a region with only one plot cause this error? Second, my real data frame is named `B.data`. Should I replace `relData` and `data` with `B.data` for this code to work properly? – Danielle May 31 '17 at 16:12
  • Just replace `data` with `B.data` and yes, you are right. A region with just one plot would indeed cause an error. You'd have to check the length inside the function. I'll edit the answer accordingly. – ikop May 31 '17 at 17:45
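
A one-plot region trips an unguarded loop because `2:plot.num` counts downward when `plot.num` is 1, so the body still runs with `i = 2` against a 1 x 1 matrix. A small sketch of that behavior follows, together with one loop-bound alternative to the `if` check; using `seq_len(plot.num)[-1]` is a suggestion made here, not part of the answer.

```r
plot.num <- 1

2:plot.num              # c(2, 1): an unguarded loop would run for i = 2 and i = 1
seq_len(plot.num)[-1]   # integer(0): the loop body is skipped entirely

output <- matrix(NA, plot.num, plot.num)
for (i in seq_len(plot.num)[-1]) {
  for (j in seq_len(i - 1)) {
    output[i, j] <- 0L  # never reached when plot.num is 1
  }
}
output                  # still the 1 x 1 NA matrix
```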