I'm trying to calculate Adamic-Adar similarity for a network that has two types of nodes. I'm only interested in the similarity between nodes that have outgoing connections. Nodes with incoming connections act as connectors, and I'm not interested in them.

Data size and characteristics:

> summary(g)
IGRAPH DNW- 3852 24478 -- 
+ attr: name (v/c), weight (e/n)

Prototype code in Python 2.7:

import glob
import os
import pandas as pd
from igraph import *

os.chdir("data/")
for file in glob.glob("*.graphml"):
    print(file)
    g = Graph.Read_GraphML(file)
    indegree = Graph.degree(g, mode="in")

    g['indegree'] = indegree
    dev = g.vs.select(indegree == 0)

    m = Graph.similarity_inverse_log_weighted(dev.subgraph())

    df = pd.melt(m)

    df.to_csv(file.split("_only.graphml")[0] + "_similarity.csv", sep=',')

There is something wrong with this code, because dev is of length 1, and m is 0.0, so it doesn't work as expected.

Hint

I have working code in R, but I can't seem to rewrite it in Python (which I'm doing for the sake of performance, since the networks are huge). Here it is:

   # make sure g is your network
   indegree <- degree(g, mode="in")
   V(g)$indegree <- indegree
   dev <- V(g)[indegree==0]
   m <- similarity.invlogweighted(g, dev)
   x.m <- melt(m)
   colnames(x.m) <- c("dev1", "dev2", "value")
   x.m <- x.m[x.m$value > 0, ]

   write.csv(x.m, file = sub(".csv",
                             "_similarity.csv", filename))
oski86

1 Answer

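You are assigning the in-degrees as a graph attribute, not as a vertex attribute, so g.vs.select() has no vertex attribute named indegree to filter on. Also note that select() expects the condition as a keyword argument (indegree=0); the expression indegree == 0 is evaluated by Python before the call, and it compares a list with 0. You need this instead: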

indegree = g.degree(mode="in")
g.vs["indegree"] = indegree
dev = g.vs.select(indegree=0)

But actually, you could simply write this:

dev = g.vs.select(_indegree=0)

This works because of how the select method works:

Attribute names inferred from keyword arguments are treated specially if they start with an underscore (_). These are not real attributes but refer to specific properties of the vertices, e.g., its degree. The rule is as follows: if an attribute name starts with an underscore, the rest of the name is interpreted as a method of the Graph object. This method is called with the vertex sequence as its first argument (all others left at default values) and vertices are filtered according to the value returned by the method.
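
To check the igraph output on a small graph, the quantity the question is after can be sketched in plain Python. This is a hand-rolled Adamic-Adar computation, not igraph's implementation: the edge list and the function name `adamic_adar_between_sources` are hypothetical, and weighting each shared connector by its in-degree is an assumption (which degree igraph uses depends on the `mode` argument of `similarity_inverse_log_weighted`). It also shows why the similarity has to be computed on the full graph, as the R version does by passing `g` together with `dev`: the scores come from shared connectors, and a subgraph that keeps only the in-degree-0 vertices has no edges left.

```python
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical toy edge list: sources a, b, c point at connectors x, y.
edges = [("a", "x"), ("b", "x"), ("a", "y"), ("b", "y"), ("c", "y")]


def adamic_adar_between_sources(edges):
    """Return {(u, v): score} for pairs of in-degree-0 nodes with score > 0."""
    successors = defaultdict(set)    # node -> nodes it points to
    predecessors = defaultdict(set)  # node -> nodes pointing to it
    for source, target in edges:
        successors[source].add(target)
        predecessors[target].add(source)

    # Same filter as select(_indegree=0): nodes that nothing points to.
    sources = sorted(n for n in successors if n not in predecessors)

    scores = {}
    for u, v in combinations(sources, 2):
        common = successors[u] & successors[v]
        # Assumption: each shared connector is weighted by its in-degree.
        # A shared connector has in-degree >= 2, so log() is never zero here.
        score = sum(1.0 / math.log(len(predecessors[z])) for z in common)
        if score > 0:  # mirrors the x.m$value > 0 filter in the R code
            scores[(u, v)] = score
    return scores


similarity = adamic_adar_between_sources(edges)
for (u, v), value in sorted(similarity.items()):
    print(u, v, round(value, 4))
```

The result is already in the long "dev1, dev2, value" shape that the R code obtains with melt(), so it can be written to CSV row by row.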

Tamás