I'm trying to calculate different network statistics with the following (demo) data

                       BLM  NPS  USFWS  USFS  Bureau of Reclamation
BLM                     10    1      8     2                      1
NPS                      1    3      2     0                      0
USFWS                    8    2     22     5                      2
USFS                     2    0      5     5                      0
Bureau of Reclamation    1    0      2     0                      2

When it came to running network statistics I've run into a couple problems.

  • First, why do the following two igraph objects give different values for the average weighted degree (the number of edges connected to each node, summed and divided by the total number of nodes)?
library(igraph)
library(reshape2)
# Remove the diagonal. This was done because creating an adjacency matrix from an
# edge and node list with "igraph[]" leaves the diagonal blank.
# Source: http://pablobarbera.com/big-data-upf/html/02a-networks-intro-visualization.html
diag(mat) <- NA

salfTmatlist <- mat %>% melt() #change into list

#create igraph objects
salfTnet <- graph_from_data_frame(d=salfTmatlist, directed=T) 
Partner_Network <- graph_from_adjacency_matrix(mat)

#calculate average weighted degree
mean(degree(salfTnet, mode = "total"))
mean(degree(Partner_Network, mode = "total"))

#checking calculations by hand
added <- rowSums(mat, na.rm = TRUE) #this gives the correct number of edges across each node
mean(added)

The hand check seems to match the value from the igraph object built from the adjacency matrix. What I don't understand is why building the igraph object from the same data with a different function changes the result.

  • For modularity, I can't check my work by hand, and again I get a different answer from each object:
modularity(Partner_Network, V(Partner_Network))

modularity(salfTnet, V(salfTnet))

Any thoughts on the following:

  • why there are different values from functions being applied to the same data?
  • Which calculations are correct? (based on the check it seems like I should stick to the igraph object generated from "graph_from_adjacency_matrix" but I'm not sure)
MrFlick
as365
  • What is the interpretation of this matrix you are starting with? What are the values in the off diagonal supposed to represent? – MrFlick Dec 22 '20 at 02:36

1 Answer

With your data

mat <- matrix(c(10L, 1L, 8L, 2L, 1L, 1L, 3L, 2L, 0L, 0L, 8L, 2L, 
  22L, 5L, 2L, 2L, 0L, 5L, 5L, 0L, 1L, 0L, 2L, 0L, 2L), 
  nrow = 5, ncol=5, 
  dimnames = list(
    c("BLM", "NPS", "USFWS", "USFS", "Bureau of Reclamation"), 
    c("BLM", "NPS", "USFWS", "USFS", "Bureau of Reclamation"))
)
diag(mat) <- NA 

It's a bit unclear how the off-diagonal values are meant to be interpreted. With an adjacency matrix you usually just have 0/1 values there. I'll assume that a non-zero value means the two nodes are connected. So if you are melting the matrix into a data frame, you only want to create connections for rows that have a non-zero value, which means you need to filter your data.frame:

salfTmatlist <- mat %>% melt()
salfTnet <- graph_from_data_frame(d=subset(salfTmatlist, value>0), directed=T) 

Otherwise the default for graph_from_data_frame is just to create one edge for each row and it will track the extra column as an edge attribute. So you get edges for rows that have zeros.
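
To make that concrete, here is a small sketch that counts the edges with and without the filter. It rebuilds the same mat so it runs on its own; the names agencies, edges_all, g_all and g_nonzero are only illustrative and are not from the question.

library(igraph)
library(reshape2)

agencies <- c("BLM", "NPS", "USFWS", "USFS", "Bureau of Reclamation")
mat <- matrix(c(10L, 1L, 8L, 2L, 1L, 1L, 3L, 2L, 0L, 0L, 8L, 2L,
  22L, 5L, 2L, 2L, 0L, 5L, 5L, 0L, 1L, 0L, 2L, 0L, 2L),
  nrow = 5, ncol = 5, dimnames = list(agencies, agencies))
diag(mat) <- NA

# melt() returns one row per matrix cell, so 25 rows for a 5 x 5 matrix
edges_all <- melt(mat)

# Unfiltered: every row becomes an edge, including the zero and NA cells
g_all <- graph_from_data_frame(d = edges_all, directed = TRUE)

# Filtered: subset() drops the rows where value is 0 or NA
g_nonzero <- graph_from_data_frame(d = subset(edges_all, value > 0), directed = TRUE)

ecount(g_all)      # 25
ecount(g_nonzero)  # 14

The 11 extra edges in g_all are the 5 diagonal cells plus the 6 zero cells, which is why the unfiltered graph reports a higher degree.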

For your adjacency matrix version, you just need to set all the off-diagonal values to zero or one. You can do that with a simple comparison against zero:

Partner_Network <- graph_from_adjacency_matrix(mat>0)

If you do leave in numbers other than 0/1, igraph will create n edges from i to j where mat[i,j]=n. Rather than repeating edges, you could also use those values as weights, but the default is to create redundant edges.
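
If you do want the counts treated as weights, something like the following may work. This is only a sketch: it assumes the cell values are tie strengths, and it zeroes the diagonal instead of using NA so that no self-ties are created. The name g_weighted is illustrative.

library(igraph)

agencies <- c("BLM", "NPS", "USFWS", "USFS", "Bureau of Reclamation")
mat <- matrix(c(10L, 1L, 8L, 2L, 1L, 1L, 3L, 2L, 0L, 0L, 8L, 2L,
  22L, 5L, 2L, 2L, 0L, 5L, 5L, 0L, 1L, 0L, 2L, 0L, 2L),
  nrow = 5, ncol = 5, dimnames = list(agencies, agencies))
diag(mat) <- 0  # assumption: drop self-ties by zeroing the diagonal

# weighted = TRUE keeps one edge per non-zero cell and stores the cell
# value in an edge attribute called "weight"
g_weighted <- graph_from_adjacency_matrix(mat, mode = "directed", weighted = TRUE)

ecount(g_weighted)                        # 14

# strength() is the weighted counterpart of degree(); it sums the
# "weight" attribute over each vertex's edges
strength(g_weighted, mode = "out")        # same values as rowSums(mat)
mean(strength(g_weighted, mode = "out"))  # 8.4

With this version the mean out-strength matches the hand check mean(rowSums(mat, na.rm = TRUE)) from the question, because each row sum is exactly the total weight leaving that node.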

With these two versions, both return the same mean degree count

mean(degree(salfTnet, mode = "total"))
# [1] 5.6
mean(degree(Partner_Network, mode = "total"))
# [1] 5.6
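
As a cross-check that does not rely on igraph, the same number can be reproduced from the matrix directly. This sketch assumes, as above, that any non-zero off-diagonal cell is a connection; adj, out_deg and in_deg are illustrative names.

agencies <- c("BLM", "NPS", "USFWS", "USFS", "Bureau of Reclamation")
mat <- matrix(c(10L, 1L, 8L, 2L, 1L, 1L, 3L, 2L, 0L, 0L, 8L, 2L,
  22L, 5L, 2L, 2L, 0L, 5L, 5L, 0L, 1L, 0L, 2L, 0L, 2L),
  nrow = 5, ncol = 5, dimnames = list(agencies, agencies))

adj <- mat > 0       # TRUE where two agencies are connected
diag(adj) <- FALSE   # ignore self-ties

out_deg <- rowSums(adj)  # edges leaving each node
in_deg  <- colSums(adj)  # edges arriving at each node

sum(adj)                 # 14 directed edges in total
mean(out_deg + in_deg)   # 5.6

Each of the 14 directed edges is counted once as an out-edge and once as an in-edge, so the mean total degree is 2 * 14 / 5 = 5.6.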
MrFlick