I want to compute the closeness centrality measure on a network with disconnected components. The closeness function in igraph does not give meaningful results on such graphs. (see)
Then I came across this site, where it is explained that closeness can also be measured on graphs with disconnected components.
The following code is what is suggested to achieve this:
# Load tnet
library(tnet)
# Load network
# Node K is assigned node id 8 instead of 10 as isolates at the end of id sequences are not recorded in edgelists
net <- cbind(
i=c(1,1,2,2,2,3,3,3,4,4,4,5,5,6,6,7,9,10,10,11),
j=c(2,3,1,3,5,1,2,4,3,6,7,2,6,4,5,4,10,9,11,10),
w=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1))
# Calculate measures
closeness_w(net, gconly=FALSE)
In my case, I have transaction data, so the network I build from it is directed and weighted. The weights are 1/(transaction amount).
This is my example data:
df <- structure(list(id = c(2557L, 1602L, 18669L, 35900L, 48667L, 51341L
), from = c("5370", "6390", "5370", "5370", "8934", "5370"),
to = c("5636", "5370", "8933", "8483", "5370", "7626"), date = structure(c(13099,
13113, 13117, 13179, 13238, 13249), class = "Date"), amount = c(2921,
8000, 169.2, 71.5, 14.6, 4214)), row.names = c(NA, -6L), class = "data.frame")
I use the following code to achieve what I want:
library(dplyr)
# Aggregate transactions per (from, to) pair and use 1/sum(amount) as the weight
df2 <- df %>%
  select(from, to, amount) %>%
  group_by(from, to) %>%
  mutate(weights = 1/sum(amount)) %>%
  select(-amount) %>%
  distinct()
network <- cbind(df2$from, df2$to, df2$weights)
# The next line fails with: "Error in net[, "w"]^alpha : non-numeric argument to binary operator"
cl <- closeness_w(network, directed = TRUE, gconly = FALSE)
# so I convert the from and to columns to integers to get rid of the error above
df2$from <- as.integer(df2$from)
df2$to <- as.integer(df2$to)
# then I run the code again
network <- cbind(df2$from, df2$to, df2$weights)
cl <- closeness_w(network, directed = TRUE, gconly = FALSE)
However, the output does not look like the one on the website, which consists only of closeness scores for each node. Instead, it contains a great many rows with a value of 0, and I don't know why.
The output I got is as follows:
node closeness n.closeness
[1,] 1 0.00000000 0.000000000000
[2,] 2 0.00000000 0.000000000000
[3,] 3 0.00000000 0.000000000000
[4,] 4 0.00000000 0.000000000000
[5,] 5 0.00000000 0.000000000000
...........................................................
[330,] 330 0.00000000 0.000000000000
[331,] 331 0.00000000 0.000000000000
[332,] 332 0.00000000 0.000000000000
[333,] 333 0.00000000 0.000000000000
[ reached getOption("max.print") -- omitted 8600 rows ]
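One possible reading of this output, which is an assumption on my part and not something I have confirmed: the row numbers run from 1 up to roughly my largest account id, so the ids may be being treated as node positions, with every unused number in between counted as an isolated node. If that is the case, relabelling the accounts with consecutive integers before building the edge list might avoid the empty rows. A minimal sketch in base R, using the example data from above (the names edges, ids and id_lookup are only illustrative):

```r
# Example edge list from the question (account ids as character strings).
edges <- data.frame(
  from   = c("5370", "6390", "5370", "5370", "8934", "5370"),
  to     = c("5636", "5370", "8933", "8483", "5370", "7626"),
  amount = c(2921, 8000, 169.2, 71.5, 14.6, 4214),
  stringsAsFactors = FALSE
)

# Collect every distinct account id and give each one a consecutive integer.
ids <- sort(unique(c(edges$from, edges$to)))
i <- match(edges$from, ids)
j <- match(edges$to, ids)

# Numeric three-column edge list (i, j, w), the same shape as the website example.
# The raw amount is used as the weight here purely for illustration.
net <- cbind(i = i, j = j, w = edges$amount)

# Lookup table to translate the compact node numbers back to account ids.
id_lookup <- data.frame(node = seq_along(ids), account = ids)

# With tnet loaded, the call from above would then be:
# cl <- closeness_w(net, directed = TRUE, gconly = FALSE)
```

The node column of the result could then be joined back to id_lookup to recover the original account ids.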
Also, the i and j columns in the data given on the website are reciprocal, that is, 1->2 exists iff 2->1 exists. My data is not like that: 5370 sent money to 5636, but 5636 has not sent any money to 5370. So how can I compute the closeness measure correctly on such a directed network of transaction data? Has anyone tried a similar computation before?
EDIT: Since the closeness_w function treats the weights as tie strength rather than as distance, I should have defined the weights as sum(amount) instead of 1/sum(amount).
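To make the correction concrete, here is a minimal sketch of the weight computation with sum(amount) per directed pair. It uses base R's aggregate rather than my dplyr pipeline so that it runs without extra packages; the column name weights follows my code above, and the data are the example rows from the question.

```r
# Example transactions from the question.
df <- data.frame(
  from   = c("5370", "6390", "5370", "5370", "8934", "5370"),
  to     = c("5636", "5370", "8933", "8483", "5370", "7626"),
  amount = c(2921, 8000, 169.2, 71.5, 14.6, 4214),
  stringsAsFactors = FALSE
)

# Total amount sent along each directed (from, to) pair.
# A larger total means a stronger tie, so no 1/x inversion is applied.
df2 <- aggregate(amount ~ from + to, data = df, FUN = sum)
names(df2)[names(df2) == "amount"] <- "weights"

print(df2)
```

In the example data every (from, to) pair occurs only once, so each weight equals the single transaction amount; with repeated pairs the amounts would be added together.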