
I have been trying to study the impact on a network of deleting different combinations of nodes.

To study this I have used the networkx graph-theory metric global efficiency. However, I noticed that the networkx code ignores edge weights when calculating global efficiency, so I went into the source code and added weight as a parameter. It seems to be working and gives me different values than the unweighted approach, but it is exceptionally slow (about 20 times slower).

How can I speed up these computations?

## The code I am running

import networkx
from networkx.algorithms.efficiency import global_efficiency
import pandas



data = pandas.read_csv("ones.csv")
lol = data.values.tolist()
data = pandas.read_csv("twos.csv")
lol2 = data.values.tolist()   # lol and lol2 are read in here but not used below

# Node combinations to "delete" by zeroing their rows and columns.
combo = [["10pp", "10d"]]
GE_list = []


for row in combo:
    values = row
    # Re-read the full adjacency matrix, then zero out the rows and
    # columns of the nodes being deleted.
    datasafe = pandas.read_csv("b1.csv", index_col=0)
    datasafe.loc[values, :] = 0
    datasafe[values] = 0

    g = networkx.from_pandas_adjacency(datasafe)
    ge = global_efficiency(g)
    GE_list.append(ge)

# Labels for the two extra runs appended below.
extra = [""]
extra2 = ["full"]
combo.append(extra)
combo.append(extra2)

# Global efficiency of the full (untouched) network.
datasafe = pandas.read_csv("b1.csv", index_col=0)
g = networkx.from_pandas_adjacency(datasafe)
ge = global_efficiency(g)
GE_list.append(ge)

values = ["s6-8","p9-46v","p47r","p10p","IFSp","IFSa",'IFJp','IFJa','i6-8','a9-46v','a47r','a10p','9p','9a','9-46d','8C','8BL','8AV','8AD','47s','47L','10pp','10d','46','45','44']
datasafe=pandas.read_csv("b1.csv", index_col=0)
datasafe.loc[values, :] = 0

datasafe[values] = 0


g=networkx.from_pandas_adjacency(datasafe)
ge=global_efficiency(g)
GE_list.append(ge)

output = pandas.DataFrame(list(zip(combo, GE_list)))
output.to_csv("delete 1.csv", index=None)


## The change I made to the original networkx code
    try:
        eff = 1 / nx.shortest_path_length(G, u, v)
## changed to
    try:
        eff = 1 / nx.shortest_path_length(G, u, v, weight='weight')

Previously, with unweighted graphs, I was able to process my data in about 2 hours; currently it's taking the same time to do a twentieth of the data. Please do suggest any improvements to my code, or any other code I could run instead.

PS: I don't have a great understanding of Python, so please bear with me :)

1 Answer


Using weights, you exchange breadth-first search for Dijkstra's algorithm, which increases the runtime by a factor of log |V|; see the second comment of https://stackoverflow.com/a/25449911
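If you want to stay with networkx for now, one thing worth trying (a rough sketch, not tested on your data, and assuming your edge weights are stored under the default 'weight' attribute) is to run Dijkstra once per source node with all_pairs_dijkstra_path_length, instead of calling shortest_path_length once per node pair as the modified global_efficiency does, which repeats a lot of work:

import networkx as nx

def weighted_global_efficiency(G, weight="weight"):
    n = G.number_of_nodes()
    if n < 2:
        return 0.0
    total = 0.0
    # One Dijkstra run per source gives the distances to all reachable targets.
    for source, lengths in nx.all_pairs_dijkstra_path_length(G, weight=weight):
        for target, dist in lengths.items():
            if target != source and dist > 0:
                total += 1.0 / dist
    # Unreachable pairs contribute nothing, as in the unweighted version.
    return total / (n * (n - 1))

You would then call weighted_global_efficiency(g) instead of global_efficiency(g), without touching the installed networkx source.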

If you have problems with the runtime, you should rather replace networkx, which is implemented in Python, with a C implementation like graph-tool or igraph; see e.g. this (probably biased) performance comparison: https://graph-tool.skewed.de/performance
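As a rough illustration only (the file name "b1.csv" and the 'weight' attribute are taken from the question, everything else is an assumption, and method names vary between igraph versions), the same weighted efficiency in python-igraph could look roughly like this:

import pandas
from igraph import Graph

datasafe = pandas.read_csv("b1.csv", index_col=0)

# Build a weighted undirected graph from the (symmetric) adjacency matrix.
g = Graph.Weighted_Adjacency(datasafe.values.tolist(), mode="undirected", attr="weight")

# Weighted shortest path lengths between all node pairs, computed in C
# (this method is called shortest_paths in older igraph versions).
dists = g.distances(weights="weight")

n = g.vcount()
total = sum(1.0 / d for row in dists for d in row if d not in (0, float("inf")))
ge = total / (n * (n - 1))
print(ge)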

Sparky05