
My goal is to find the point at which scale-free networks become indistinguishable from random (non-scale-free) networks, using the powerlaw Python package.

As stated in their paper, the goodness of a power-law fit should always be judged by comparing it to the fit of an alternative distribution.

I would have expected something like the binomial distribution to be available for such a comparison, but it is not.

For example, I tried the following code to distinguish between an obviously scale-free network and an obviously non-scale-free network (both with similar numbers of nodes and edges):

import networkx as nx
import powerlaw

non_sf_graph = nx.gnp_random_graph(10000, 0.002)  # Erdos-Renyi G(n, p): binomially distributed degrees
sf_graph = nx.barabasi_albert_graph(10000, 10)    # Barabasi-Albert: power-law distributed degrees

# dict(...) around degree() works with both the networkx 1.x dict and the 2.x DegreeView
fitpl = powerlaw.Fit(list(dict(sf_graph.degree()).values()))
fitnpl = powerlaw.Fit(list(dict(non_sf_graph.degree()).values()))

for dist in fitpl.supported_distributions.keys():
    if dist == 'power_law':
        continue  # comparing the power law with itself is uninformative
    print(dist)
    print(fitpl.distribution_compare('power_law', dist))
    print(fitnpl.distribution_compare('power_law', dist))

The output suggests that none of the implemented distributions provides a way to distinguish between a preferential attachment model and a G(n, p) random graph:

lognormal
(-0.23698971255249646, 0.089194415705275421)
(-20.320811335334504, 3.9097599268295484e-92)
exponential
(511.41420648854108, 7.3934851812182895e-23)
(24.215231521373582, 3.7251410948652104e-08)
truncated_power_law
(3.3213949937049847e-06, 0.99794356568650555)
(3.1510369047360598e-07, 0.99936659460444144)
stretched_exponential
(16.756797270053454, 1.6505119872120265e-05)
(8.7110005915424153, 8.7224098659112012e-05)
lognormal_positive
(30.428201968820289, 1.7275238929002278e-07)
(6.7992592335974233, 5.4945477823229749e-06)

(The sign of the first value indicates whether the first (positive) or the second (negative) distribution is the better fit; the second value is the p-value for the significance of that decision.)
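
In case it helps with interpreting these pairs: if I read the powerlaw paper correctly, distribution_compare also accepts a normalized_ratio keyword that divides R by its standard deviation, which should make the ratios comparable across data sets. A minimal sketch reusing the fit objects from above (assuming the keyword behaves as described in the paper):

# Sketch: normalized log-likelihood ratios, assuming the normalized_ratio
# keyword from the powerlaw paper; R > 0 still favours the power law.
for label, fit in [('BA graph', fitpl), ('GNP graph', fitnpl)]:
    R, p = fit.distribution_compare('power_law', 'exponential',
                                    normalized_ratio=True)
    print(label, R, p)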

Am I going at this problem from the wrong angle, or should I implement the binomial distribution myself?

I am asking because I am no statistics expert and might not see the significance of all the available distributions. But they seem to fail this basic example.
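
In case implementing it myself is the way to go, this is the rough sketch I had in mind: compare the fitted power law against a Poisson, i.e. the large-n limit of the binomial degree distribution of a G(n, p) graph, restricted to the tail k >= xmin that powerlaw selects. The loglikelihood_ratio helper below is my own construction (not part of the package), mu is just the mean degree rather than a proper truncated-Poisson MLE, and there is no variance normalization, so this is only an approximation of the test described by Clauset et al.:

import numpy as np
from scipy import stats, special

def loglikelihood_ratio(fit, degrees):
    """Rough comparison: discrete power law vs. Poisson (binomial limit),
    both restricted to the tail k >= xmin chosen by powerlaw."""
    degrees = np.asarray(degrees)
    xmin = int(fit.power_law.xmin)
    alpha = fit.power_law.alpha
    tail = degrees[degrees >= xmin]

    # Discrete power law: p(k) = k^-alpha / zeta(alpha, xmin)
    ll_pl = np.sum(-alpha * np.log(tail)) - len(tail) * np.log(special.zeta(alpha, xmin))

    # Poisson with mu = mean degree (crude estimate), renormalized to k >= xmin
    mu = degrees.mean()
    log_tail_mass = stats.poisson.logsf(xmin - 1, mu)  # log P(K >= xmin)
    ll_poisson = np.sum(stats.poisson.logpmf(tail, mu)) - len(tail) * log_tail_mass

    return ll_pl - ll_poisson  # positive -> power law fits the tail better

print(loglikelihood_ratio(fitpl, list(dict(sf_graph.degree()).values())))
print(loglikelihood_ratio(fitnpl, list(dict(non_sf_graph.degree()).values())))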

