Working on this problem:
I have been tweaking and timing the various parts of my code (as it is too slow), and I cannot get a decent speed for the final calculation of $h_i$. From the problem statement:
$h_i$ is the hash of the adjacency list of vertex $i$, defined as follows. Suppose the vertices in the out-neighborhood of vertex $i$ are $n_1 < n_2 < \dots < n_{d_i}$. Then
$h_i = 7^0 \cdot n_1 + 7^1 \cdot n_2 + 7^2 \cdot n_3 + \dots + 7^{d_i - 1} \cdot n_{d_i}$
Since $h_i$ can be quite large, you should output only the remainder after dividing this number by $10^9 + 7$.
So I basically have a list for each vertex saying which vertices it is connected to,
e.g. [1,4,12,21], and I have to calculate $7^0 \cdot 1 + 7^1 \cdot 4 + 7^2 \cdot 12 + 7^3 \cdot 21$.
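To make the arithmetic concrete, here is a minimal sketch that evaluates the definition directly for the example list. The helper name `adjacency_hash` is made up for illustration and is not part of the problem statement; it is a straightforward reference version, not an optimized one.

```python
MOD = 10**9 + 7

def adjacency_hash(neighbors):
    """Reference hash of one adjacency list (hypothetical helper name)."""
    total = 0
    power = 1  # 7**0
    for n in sorted(neighbors):           # neighbors must be in increasing order
        total = (total + power * n) % MOD
        power = (power * 7) % MOD         # next power of 7, kept reduced
    return total

# 1*1 + 7*4 + 49*12 + 343*21 = 1 + 28 + 588 + 7203
print(adjacency_hash([1, 4, 12, 21]))  # 7820
```

A slow but obviously correct version like this is mainly useful for checking the output of faster variants.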
Each list can be up to 2000 long, and there can be up to 2000 lists to calculate the value for.
Having tried all sorts of things, my best so far is to generate a list of powers of 7 and then use zip, which takes approximately 0.25 seconds on a randomly filled 2000*2000 graph. Any faster suggestions would be appreciated!
import time, random

# Build a random test graph: one adjacency list per vertex.
testgraph = []
for _ in range(2000):
    row = []
    testgraph.append(row)
for i in range(2000):
    for j in range(random.randint(500,1800)):
        if i!=j:
            testgraph[i].append(j)

MOD = 10**9 + 7

# Precompute the powers of 7 modulo MOD.
sevens = [1]*2500
for i in range(2499):
    sevens[i+1]=(sevens[i]*7) %MOD

# Time the hash calculation over all rows.
t1 = time.time()
for row in testgraph:
    hashv = 0
    row.sort()
    for a,b in zip(row, sevens):
        hashv +=a*b
    hashv = hashv % MOD
t2 = time.time()
print(t2-t1)
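For comparison, here is a sketch of the same calculation with the inner loop pushed into built-ins via `sum`, `map` and `operator.mul`. The function names `build_powers` and `row_hash` are made up for this example, and whether this is actually faster than the zip loop is an assumption that would need timing on the real data.

```python
from operator import mul

MOD = 10**9 + 7

def build_powers(count, base=7, mod=MOD):
    """Return [base**0, base**1, ...] reduced modulo mod (hypothetical helper)."""
    powers = [1] * count
    for k in range(1, count):
        powers[k] = powers[k - 1] * base % mod
    return powers

def row_hash(sorted_row, powers, mod=MOD):
    """Hash one already-sorted adjacency list (hypothetical helper)."""
    # map() stops at the shorter input, so extra powers are ignored.
    return sum(map(mul, sorted_row, powers)) % mod

sevens = build_powers(2500)
print(row_hash([1, 4, 12, 21], sevens))  # 7820
```

The modulo is taken once per row here. Python integers do not overflow, so this stays correct; the intermediate sum just grows to a few dozen bits beyond the modulus for rows of up to 2000 entries.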