4

I have a undirected network where each node can be one of k types. For each node i, I need to calculate the number of neighbors that node i has of each type.

Right now I am representing the edges with an edgelist where the columns are indexes of the nodes. The nodes are represented as a n x k matrix, where each column represents a node type. If a node is of type k then the kth column's value is 1, 0 otherwise.

Here's my current code, which is correct, but too slow.

# example nodes and edges, both typically much longer
nodes = np.array([[0, 0, 1], 
                  [0, 1, 0],                       
                  [1, 0, 0]])
edges = np.array([[0, 1],
                  [1, 2]])

neighbors = np.zeros_like(nodes)

for i, j in edges:
   neighbors[i] += nodes[j]
   neighbors[j] += nodes[i]

Is there some clever numpy that would allow me to avoid this for loop? If the best way to do this is with an adjacency matrix, that is also acceptable.

Divakar
  • 218,885
  • 19
  • 262
  • 358
fgregg
  • 3,173
  • 30
  • 37

2 Answers2

2

If I understand your question correctly, the numpy_indexed package (disclaimer: I am its author) has a fast and elegant solution to this:

# generate a random example graph
n_edges = 50
n_nodes = 10
n_types = 3
edges = np.random.randint(0, n_nodes, size=(n_edges, 2))
node_types = np.random.randint(0, 2, size=(n_nodes, n_types)).astype(np.bool)

# Note; this is for a directed graph
s, e = edges.T
# for undirected, add reversed edges
s, e = np.concatenate([edges, edges[:,::-1]], axis=0).T
import numpy_indexed as npi
node_idx, neighbor_type_count = npi.group_by(s).sum(node_types[e])

In general, operations on graphs, or algorithms involving jagged-arrays, can often be efficiently and elegantly expressed using grouping-operations.

Eelco Hoogendoorn
  • 10,459
  • 1
  • 44
  • 42
1

You can simply use np.add.at -

out = np.zeros_like(nodes)
np.add.at(out, edges[:,0],nodes[edges[:,1]])
np.add.at(out, edges[:,1],nodes[edges[:,0]])
Divakar
  • 218,885
  • 19
  • 262
  • 358