1

I'm trying to solve the Hackerrank Project Euler Problem #14 (Longest Collatz sequence) using Python 3. Here is my implementation.

cache_limit = 5000001
lookup = [0] * cache_limit
lookup[1] = 1


def collatz(num):
    if num == 1:
        return 1
    elif num % 2 == 0:
        return num >> 1
    else:
        return (3 * num) + 1


def compute(start):
    global cache_limit
    global lookup
    cur = start
    count = 1

    while cur > 1:
        count += 1
        if cur < cache_limit:
            retrieved_count = lookup[cur]
            if retrieved_count > 0:
                count = count + retrieved_count - 2
                break
            else:
                cur = collatz(cur)
        else:
            cur = collatz(cur)

    if start < cache_limit:
        lookup[start] = count

    return count


def main(tc):
    test_cases = [int(input()) for _ in range(tc)]
    bound = max(test_cases)
    results = [0] * (bound + 1)

    start = 1
    maxCount = 1
    for i in range(1, bound + 1):
        count = compute(i)
        if count >= maxCount:
            maxCount = count
            start = i
        results[i] = start

    for tc in test_cases:
        print(results[tc])


if __name__ == "__main__":
    tc = int(input())
    main(tc)

There are 12 test cases. The above implementation passes up to test case #8 but fails test cases #9 through #12 with the following message:

Terminated due to timeout

I've been stuck on this for a while now and am not sure what else can be done here.

What else can be optimized here so that I stop getting timed out?

Any help will be appreciated :)

Note: Using the above implementation, I'm able to solve the actual Project Euler Problem #14. It times out only on those 4 test cases on Hackerrank.

Bilesh Ganguly

5 Answers

1

Yes, there are things you can do to optimize your code. More importantly, though, there is a mathematical observation at the heart of the problem:

whenever n is odd, 3 * n + 1 is always even.

Given this, (3 * n + 1) can always be halved immediately, so an odd step and the even step that follows it can be taken together as n -> (3 * n + 1) / 2 and counted as two steps. That saves a fair bit of time.
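A minimal sketch of how that observation could be applied, assuming a hypothetical helper named `chain_length` (it is not part of the question's code). It counts the terms of the sequence and, for odd n, jumps straight to (3 * n + 1) / 2 while adding two to the count:

```python
def chain_length(n):
    """Number of terms in the Collatz sequence from n down to 1 (inclusive)."""
    count = 1  # the starting term itself
    while n > 1:
        if n % 2 == 0:
            n >>= 1
            count += 1
        else:
            # 3n + 1 is even for odd n, so take both steps in one go
            n = (3 * n + 1) >> 1
            count += 2
    return count


print(chain_length(13))  # 13, 40, 20, 10, 5, 16, 8, 4, 2, 1
```

The shortcut roughly halves the number of loop iterations spent on odd terms; it can be combined with a lookup table like the one in the question.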

gregory
1

Here is an improvement (it takes 1.6 seconds): there is no need to compute the full sequence for every number. You can keep a dictionary that stores the number of elements in the sequence of each starting number. As soon as the sequence drops below its starting number, the rest of the chain has already been computed, so the length is obtained as dic[original_number] = dic[n] + count - 1. This saves a lot of time.

import time

start = time.time()

def main(n,dic):
    '''Counts the elements of the sequence starting at n and finishing at 1''' 
    count = 1
    original_number = n
    while True:
        if n < original_number:
            dic[original_number] = dic[n] + count - 1 #-1 because when n < original_number, n is counted twice otherwise
            break
        if n == 1:
            dic[original_number] = count
            break
        if (n % 2 == 0):
            n = n // 2  # floor division keeps n an int (n/2 yields a float in Python 3)
        else:
            n = 3*n + 1
        count += 1
    return dic

limit = 10**6
dic = {n:0 for n in range(1,limit+1)}

if __name__ == '__main__':
    n = 1
    while n < limit:
        dic=main(n,dic)

        n += 1        
    print('Longest chain: ', max(dic.values()))
    print('Number that gives the longest chain: ', max(dic, key=dic.get))
    end = time.time()

    print('Time taken:', end-start)
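To see where the `- 1` in that formula comes from, here is a small hand-checkable trace for the starting number 3. The `lengths` dictionary is a hypothetical stand-in for `dic`, pre-filled with the chain lengths of 1 and 2:

```python
# Chain lengths already known for smaller starting numbers: [1] and [2, 1]
lengths = {1: 1, 2: 2}

n = 3
count = 1  # terms visited so far, including the start
while n >= 3:
    n = n // 2 if n % 2 == 0 else 3 * n + 1
    count += 1

# Here n == 2 and count == 7: the term 2 is included in `count`
# and again in lengths[2], hence the "- 1".
lengths[3] = lengths[n] + count - 1
print(lengths[3])  # 3, 10, 5, 16, 8, 4, 2, 1 has 8 terms
```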
paulanueno
0

The trick to solving this question is to compute the answers only up to the largest input and store them in a lookup table that serves all the smaller inputs, rather than computing up to the extreme upper bound.

Here is my implementation (Python 3), which passes all the test cases.

MAX = int(5 * 1e6)
ans = [0]
steps = [0]*(MAX+1)
 
def solve(N):
    if N < MAX+1:
        if steps[N] != 0:
            return steps[N]
    if N == 1:
        return 0
    else:
        if N % 2 != 0:
            result = 1+ solve(3*N + 1) # This is recursion
        else:
            result = 1 + solve(N>>1) # This is recursion
        if N < MAX+1:    
            steps[N]=result # This is memoization
        return result
    
inputs = [int(input()) for _ in range(int(input()))]
largest = max(inputs)

mx = 0
collatz=1
for i in range(1,largest+1):
    curr_count=solve(i)
    if curr_count >= mx:
        mx = curr_count
        collatz = i
    ans.append(collatz)
    
for _ in inputs:
    print(ans[_])
-1

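This is my brute-force take:

# Longest chain found so far (C) and the starting number that produced it (N)
C = 0
N = 0
for i in range(1, 1000001):
    n = i
    c = 0  # number of steps from i down to 1
    while n != 1:
        if n % 2 == 0:
            _next = n // 2  # floor division keeps n an int in Python 3
        else:
            _next = 3 * n + 1
        c = c + 1
        n = _next
    if c > C:
        C = c
        N = i

print(N, C)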
-2

Here's my implementation (specifically for the question on the Project Euler website):

num = 1
limit = int(input())
seq_list = []
while num < limit:
    sequence_num = 0
    n = num
    if n == 1:
        sequence_num = 1
    else:
        while n != 1:
            if n % 2 == 0:
                n = n // 2  # floor division keeps n an int in Python 3
                sequence_num += 1
            else:
                n = 3 * n + 1
                sequence_num += 1

        sequence_num += 1
    seq_list.append(sequence_num)
    num += 1

k = seq_list.index(max(seq_list))
print(k + 1)