TL;DR
The (main, IMO) problem comes from the line length_of_sequence += len(sequence)
, because it should be length_of_sequence = len(sequence)
, and from the fact that you don't reset sequence
at each iteration.
However there are multiple other bugs. Here is a correct version.
def solution(x):
d = {}
for i in range(2, x+1):
sequence = []
while i != 1:
sequence.append(i)
if i % 2 == 0: i //= 2
else: i = 3*i + 1
sequence.append(1)
length_of_sequence = len(sequence)
d[i] = length_of_sequence
return max(d, key=d.get)
Here is a better (faster) way to do this, with recursion.
def solution(x):
d = {1: 1}
def collatz(n):
if n not in d: d[n] = (collatz(n//2) if n % 2 == 0 else collatz(3*n+1)) + 1
return d[n]
return max(range(1, x+1), key=collatz)
Explanation
Let's start from your code, and refactor progressively to the correct one, then continue to the faster one.
def solution(x):
sequence = []
d = {}
length_of_sequence = 0
for i in range (x, 1, -1):
while(i != 1):
sequence.append(i)
if(i % 2 == 0):
i = i // 2
else:
i = 3 * i + 1
sequence.append(1)
length_of_sequence += len(sequence)
d.update({i: length_of_sequence})
return max(d, key=d.get)
patching sequence
Notice that you initialize sequence
only once, but you keep appending to it, meaning that on the second iteration, it already contains elements from the first iteration, and thus its length is strictly increasing. Since you end with 2
, the maximum will probably be 2
.
The patch is easy, just recreate the sequence for each i
you are looking at.
def solution(x):
d = {}
length_of_sequence = 0
for i in range (x, 1, -1):
sequence = []
while(i != 1):
sequence.append(i)
if(i % 2 == 0):
i = i // 2
else:
i = 3 * i + 1
sequence.append(1)
length_of_sequence += len(sequence)
d.update({i: length_of_sequence})
return max(d, key=d.get)
patching length_of_sequence
The previous patch, however, has no effect because you still carry the length of sequence from one iteration to the next, so you should initialize it at each iteration, and update it only once you have finished completing the sequence.
I'll also transform
length_of_sequence = 0
length_of_sequence += len(sequence)
into
length_of_sequence = len(sequence)
def solution(x):
d = {}
for i in range (x, 1, -1):
sequence = []
while(i != 1):
sequence.append(i)
if(i % 2 == 0):
i = i // 2
else:
i = 3 * i + 1
sequence.append(1)
length_of_sequence = len(sequence) # This has been de-indented
# because we are not interested
# in the temporary length of the sequence,
# but only in the final one.
d.update({i: length_of_sequence})
return max(d, key=d.get)
patching append
Currently, you are append
ing 1
each time you append a new element to the sequence, so the result is 4 -> 1 -> 2 -> 1
. You only want to append it once, at the end.
def solution(x):
d = {}
for i in range (x, 1, -1):
sequence = []
while(i != 1):
sequence.append(i)
if(i % 2 == 0):
i = i // 2
else:
i = 3 * i + 1
sequence.append(1)
length_of_sequence = len(sequence)
d.update({i: length_of_sequence})
return max(d, key=d.get)
Polishing code
Now the code is correct, but it's ugly, meaning it's not idiomatic. I'm not going to argue here about why and how your code should be idiomatic (you can google it if you are curious or not convinced), so I'll assume you want to have idiomatic code (because everybody should want to).
Parenthesis around boolean conditions
We are not in C
, or C
-like languages, here boolean conditions in while
and for
should not have parenthesis around.
def solution(x):
d = {}
for i in range(x, 1, -1):
sequence = []
while i != 1:
sequence.append(i)
if i % 2 == 0:
i = i // 2
else:
i = 3 * i + 1
sequence.append(1)
length_of_sequence = len(sequence)
d.update({i: length_of_sequence})
return max(d, key=d.get)
Dictionary update
You are using an overkill way to simply add a value to a dictionary. This should be d[i] = length_of_sequence
.
def solution(x):
d = {}
for i in range(x, 1, -1):
sequence = []
while i != 1:
sequence.append(i)
if i % 2 == 0:
i = i // 2
else:
i = 3 * i + 1
sequence.append(1)
length_of_sequence = len(sequence)
d[i] = length_of_sequence
return max(d, key=d.get)
Assignment with operation shorthand
i = i // 2
can be shortened in i //= 2
.
def solution(x):
d = {}
for i in range(x, 1, -1):
sequence = []
while i != 1:
sequence.append(i)
if i % 2 == 0:
i //= 2
else:
i = 3 * i + 1
sequence.append(1)
length_of_sequence = len(sequence)
d[i] = length_of_sequence
return max(d, key=d.get)
One-lining short blocks
This is more of a personal preference, but since some of your blocks are really short, they could be one-lined.
def solution(x):
d = {}
for i in range(x, 1, -1):
sequence = []
while i != 1:
sequence.append(i)
if i % 2 == 0: i = i // 2
else: i = 3 * i + 1
sequence.append(1)
length_of_sequence = len(sequence)
d[i] = length_of_sequence
return max(d, key=d.get)
In-order range
There is no point in doing the loop in reverse order, so let's not!
def solution(x):
d = {}
for i in range(2, x+1):
sequence = []
while i != 1:
sequence.append(i)
if i % 2 == 0: i = i // 2
else: i = 3 * i + 1
sequence.append(1)
length_of_sequence = len(sequence)
d[i] = length_of_sequence
return max(d, key=d.get)
Performance
And, voilà! We have reached the correct version. Now, that we have a correct and idiomatic code, we can start having a faster one. This is not useful for small values of x
, of course, so if you don't feel the current code is slow, you probably don't need this improvement. However, you might still be curious on how to improve it. Here we go!
A bottleneck
The problem with the current code, is that it doesn't take advantage of the fact that, essentially, computing the length of the sequence from a starting point is a pure and deterministic function. I'm not going to explain what it means, but simply state that it provides a way to fasten the code.
The idea is the following. When I am computing the length of the sequence starting with 4
(which is 4 -> 2 -> 1
), I also have to computer its sequence. However, its sequence (4 -> 2 -> 1
) also contains strictly the sequence which starts with 2
(which is 2 -> 1
), so if I did the computations in the right order, I should compute first the length of the sequence of 2
, then add one to get the length of the sequence of 4
.
This has two advantages. First, I only do the heavy work for each number once. Secondly, I don't actually need to store the sequence, I can work directly with the length of the sequence (this optimization could have been done with your version too).
To do so, we'll store the length of the sequence starting at n
in a dictionary once we will have computed it.
We will recursively compute the length of the sequence starting at x
by adding one to the length of the sub-sequence starting at the next number. But, we will, before, check we haven't already computed it. Finally, do this for every i
between 1
and x
.
def solution(x):
d = {1: 1}
def collatz(n):
if n not in d: d[n] = (collatz(n//2) if n % 2 == 0 else collatz(3*n+1)) + 1
return d[n]
return max(range(1, x+1), key=collatz)
EDIT: Applied suggestion from @dont-talk-just-code.