Optimising for Programming Challenge (KATTIS)

Question

Working on this problem:

I have been tweaking and timing the various parts of my code (as it is too slow), and I cannot get a decent speed for the final calculation of h_i:

h_i, the hash of the adjacency list of vertex i, defined as follows. Suppose the vertices in the out-neighborhood of vertex i are n₁ < n₂ < ⋯ < n_{d_i}. Then

h_i = 7⁰⋅n₁ + 7¹⋅n₂ + 7²⋅n₃ + ⋯ + 7^{d_i−1}⋅n_{d_i}

Since h_i can be quite large, you should output only the remainder after dividing this number by 10⁹+7.

So I basically have a list for each vertex saying which vertices it is connected to,

e.g. [1,4,12,21], and have to calculate 7⁰ * 1 + 7¹ * 4 + 7² * 12 + 7³ * 21.

Each list can be up to 2000 long and there can be up to 2000 lists to calculate the value for.
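As a sanity check, the definition can be evaluated directly for this small list. This is only a minimal sketch: the function name `adjacency_hash` is hypothetical and not taken from the problem statement, and it keeps a running power of 7 instead of a precomputed table.

```python
MOD = 10**9 + 7

def adjacency_hash(neighbours):
    """Hypothetical helper: sum of 7**k * n_k over the sorted neighbours, mod 10**9 + 7."""
    total = 0
    power = 1  # 7**0
    for n in sorted(neighbours):
        total = (total + power * n) % MOD
        power = (power * 7) % MOD  # next power of 7, reduced mod MOD
    return total

print(adjacency_hash([1, 4, 12, 21]))  # 1 + 28 + 588 + 7203 = 7820
```

For the example list the four terms are 1, 28, 588 and 7203, so the result stays well below the modulus.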

Having tried all sorts of things, my best so far is to generate a list of powers of 7 and then use zip, approx 0.25 seconds on a randomly filled 2000*2000 graph. Any better offers appreciated!

import time, random

testgraph = []

for _ in range(2000):
    row = []
    testgraph.append(row)
    
for i in range(2000):
    for j in range(random.randint(500,1800)):
        if i!=j:
            testgraph[i].append(j)
            
MOD = 10**9 + 7
sevens = [1]*2500
for i in range(2499):
    sevens[i+1]=(sevens[i]*7) %MOD
    
t1 = time.time()    
for row in testgraph:
    hashv = 0
    row.sort()
    for a,b in zip(row, sevens):
        hashv +=a*b
        
    hashv = hashv % MOD
t2 = time.time()

print(t2-t1)

Kelly Bundy · Answer 1 · 2022-12-21T13:15:26.580

Here is a way that is about 2.6 times faster (at least on CPython; I'm not sure Kattis uses that):

hashv = sum(map(operator.mul, row, sevens)) % MOD

If Kattis supports NumPy, that could be a lot faster still. Or maybe there's a more clever algorithm that avoids this big calculation altogether. Their "Here is an example of an easy problem that does not even need a description" rather sounds like a joke to me, indicating that it might not be easy at all. But I haven't tried solving it myself yet.

Tested/benchmarked by replacing your timed part with the following (storing the hashes in expect):

t1 = time.time()    
expect = []
for row in testgraph:
    hashv = 0
    row.sort()
    for a,b in zip(row, sevens):
        hashv +=a*b
    hashv = hashv % MOD
    expect.append(hashv)
t2 = time.time()
print(t2-t1)

and adding mine:

import operator
t1 = time.time()    
result = []
for row in testgraph:
    hashv = sum(map(operator.mul, row, sevens)) % MOD
    result.append(hashv)
t2 = time.time()
print(t2-t1, result == expect)

Sample output (your time, my time, whether we got the same result):

0.6978917121887207
0.2528214454650879 True
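For anyone who wants to repeat the comparison on their own interpreter, the two variants can also be timed with `timeit`, which reruns each one several times. This is a rough sketch, not the benchmark used above: the function names `hash_loop` and `hash_map` are made up for illustration, and the graph is deliberately much smaller than 2000*2000 so that it runs quickly.

```python
import operator
import random
import timeit

MOD = 10**9 + 7

# Precomputed powers of 7 modulo MOD, as in the question.
sevens = [1] * 2500
for i in range(2499):
    sevens[i + 1] = (sevens[i] * 7) % MOD

# Assumption: a small random graph with already sorted rows, for a quick run.
random.seed(0)
graph = [sorted(random.sample(range(2000), random.randint(50, 180)))
         for _ in range(200)]

def hash_loop(row):
    # Explicit loop over (neighbour, power of 7) pairs, as in the question.
    hashv = 0
    for a, b in zip(row, sevens):
        hashv += a * b
    return hashv % MOD

def hash_map(row):
    # Same sum, with the multiplication pushed into map().
    return sum(map(operator.mul, row, sevens)) % MOD

# Both variants must agree before their timings are worth comparing.
assert all(hash_loop(r) == hash_map(r) for r in graph)

for fn in (hash_loop, hash_map):
    best = min(timeit.repeat(lambda: [fn(r) for r in graph], number=5, repeat=3))
    print(fn.__name__, best)
```

The absolute numbers depend heavily on the interpreter, so it is worth running this under whichever one the judge uses.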

Thanks - will give it a whirl. Sadly Kattis does not allow Numpy! — ChlsM1986, Dec 21 '22 at 13:00
@ChlsM1986 And I just checked, they appear to [use PyPy](https://open.kattis.com/help/python3), where my way seems to be a bit *slower* but both take about 0.025 seconds. Not a typo, really an order of magnitude faster. And the time limit is 1 second. So now I suspect your real problem is somewhere else, not in this part of your solution. — Kelly Bundy, Dec 21 '22 at 13:06
