Segmentation Fault Error while doing big calculations in python

Question

I want to calculate the Collatz sequence for some very big numbers, but I guess Python is failing to handle numbers this big, and I don't know how to make it handle them.

Here's my program:

def collatz(n):
    if (n == -1 or n == 1 or n == -17 or n == -17 -2**4096):
        print('break found',n)
        return
    if str(n)[-1] in ['1','3','5','7','9']:
        #print(n)
        return collatz(3*n + 1)
    else:
        return collatz(n//2)

I want to use numbers in the range of n = 2**4096. I increased the recursion limit with the sys.setrecursionlimit function, but now I'm facing a segmentation fault.

>>> sys.setrecursionlimit(10**9)
>>> collatz(2**1000 + 1)
break found 1
>>> collatz(2**4000 + 1)
Segmentation fault (core dumped)

Please give me some suggestions on what I need to modify to support inputs this big.

Don't use recursion (you don't have a stack that's going to be large enough). Also, you have one branch that doesn't return a value... – thebjorn May 29 '21 at 18:32
Also, instead of `str(n)[-1] in ['1','3','5','7','9']` use `n % 2 == 1`: converting large integers to strings can be slow (although `2**4096` isn't especially large). – thebjorn May 29 '21 at 18:34

Arty · Accepted Answer · 2021-05-30T09:39:50.903

Make it non-recursive. Recursion that is too deep overflows the stack, which is usually just a few megabytes, and after a stack overflow your program crashes with a segmentation fault.
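
A minimal sketch of the guard that normally prevents this, assuming CPython: with a modest recursion limit, a runaway recursion is stopped with a catchable RecursionError. Raising the limit far beyond what the C stack can hold removes that guard, and the process can then die with a segmentation fault instead. The helper name depth_reached is hypothetical.

```python
import sys

def depth_reached(limit):
    """Recurse until the interpreter refuses, and report how deep we got."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    deepest = 0

    def recurse(level):
        nonlocal deepest
        deepest = level
        recurse(level + 1)

    try:
        recurse(1)
    except RecursionError:
        pass  # the guard fired before the C stack could overflow
    finally:
        sys.setrecursionlimit(old_limit)  # always restore the previous limit
    return deepest

print(depth_reached(2000))  # reports a depth somewhat below the limit
```

The reported depth is lower than the limit because the frames already on the stack when depth_reached is called count towards it.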

Your code modified to become non-recursive (which doesn't crash):

def collatz(n):
    while True:
        if (n == -1 or n == 1 or n == -17 or n == -17 -2**4096):
            print('break found', n)
            return
        if str(n)[-1] in ['1','3','5','7','9']:
            #print(n)
            n = 3 * n + 1
        else:
            n = n // 2


collatz(2**4000 + 1)

Output:

break found 1

BTW, the classical Collatz problem can be solved with much shorter and faster code, for example like this:

def collatz(n):
    for i in range(1 << 50):
        if n == 1:
            return i
        n = 3 * n + 1 if n & 1 else n >> 1

print('Collatz chain length:', collatz(2**4000 + 1))

Output:

Collatz chain length: 29400

As a side note, I want to mention the Python library GMPY2, which is based on the famous C library GMP. It has highly optimized long-integer arithmetic and can be used to boost your code if you really need speed.

On Windows, gmpy2 can be installed by downloading it from here and installing it with pip install gmpy2-2.0.8-cp39-cp39-win_amd64.whl. On Linux it can be installed with sudo apt install python3-gmpy2.

After installation you can use gmpy2 in a very simple manner, as in the function collatz_gmpy() below:

def collatz_py(n):
    for i in range(1 << 50):
        if n == 1:
            return i
        n = 3 * n + 1 if n & 1 else n >> 1

def collatz_gmpy(n):
    from gmpy2 import mpz
    n = mpz(n)
    for i in range(1 << 50):
        if n == 1:
            return i
        n = 3 * n + 1 if n & 1 else n >> 1

def test():
    import timeit
    n = 2 ** 100000 + 1
    for f in (collatz_py, collatz_gmpy):
        print(f.__name__, round(timeit.timeit(lambda: f(n), number=1), 3), 'secs')

test()

Output:

collatz_py 7.477 secs
collatz_gmpy 2.916 secs

As one can see, the GMPY2 variant gives a 2.56x speedup compared to the regular Python variant.
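
Since gmpy2 is a separate install, a script can fall back to built-in integers when it is missing. This is only a sketch under that assumption: collatz_len is a hypothetical name, and the loop body is the same as in collatz_gmpy() above.

```python
try:
    from gmpy2 import mpz  # optional accelerator, assumed to be installed separately
except ImportError:
    mpz = int              # fall back to Python's built-in integers

def collatz_len(n):
    """Number of steps for a positive integer n to reach 1."""
    n = mpz(n)
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n & 1 else n >> 1
        steps += 1
    return steps

print(collatz_len(6))  # 6, 3, 10, 5, 16, 8, 4, 2, 1 -> 8 steps
```

The result is the same with either integer type; only the speed on very large inputs should differ.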

edited May 30 '21 at 09:39

answered May 29 '21 at 18:35

Arty


I don't know anything about the math here... why are the two answers different (is the OP algorithm incorrect)? – thebjorn May 29 '21 at 19:09
FWIW, the second version runs in 1.518 secs on my pc, when calculating `collatz(2**40000-1)` -- although the result is different. – thebjorn May 29 '21 at 19:13
@thebjorn The answers are different because my second code outputs a different thing. In my second code I solve the classical [Collatz problem](https://cutt.ly/knsqBsL) just as an example, the way it is usually solved. It returns the number of loop iterations, which equals the length of the Collatz chain; the algorithm always produces a chain, with a different length for different inputs. The OP's algorithm always outputs `1`: it outputs not the length of the chain but its very last element, and for positive inputs that is always `1` (the negative values like `-17` never happen). – Arty May 30 '21 at 03:31
@Arty what does `for i in range(1 << 50):` do? Is it for getting a very big limit? – Vicrobot May 30 '21 at 07:33
@Arty I found on the wiki that there are 5 known cycles in the Collatz sequence: 1, -1, -5, -17, and 0. – Vicrobot May 30 '21 at 07:51
@Vicrobot Yes, 1<<50 is just a big limit on the chain length; chains that long never appear in practice. Of course you can use an infinite range here instead, but that needs `import itertools`, and I just wanted to keep things simple with an arbitrary big limit. – Arty May 30 '21 at 08:23
@Vicrobot All Collatz values are always positive (>= 1), which means you never get -1, -5 or -17. These negative values are not values of `n`; they are relative to something, maybe for example to `2^K - 17` for some K (i.e. relative to a power of 2). But in your code you compare `n` directly against negative values, which never match. Please give us the wiki link about the 5 known cycles, I'm interested in what it actually means. Also, Collatz chains never have cycles (apart from the trivial loop at 1), or at least no one has found one yet, meaning that every starting `n` will finally give `1` and the chain length is finite. – Arty May 30 '21 at 08:26
@Vicrobot Just for fun I decided to implement a faster variant of the Collatz chain computation. I have just updated my answer; see the end of it, where I placed a GMPY2-based solution that gives a `2.56x` speedup compared to the regular Python variant. – Arty May 30 '21 at 09:43
@thebjorn I see you're fond of measuring speeds, so you may want to look at the end of my answer, which I have just updated. I added GMPY2-based code there that speeds things up even more, by about `2.5x` compared to the regular Python code. – Arty May 30 '21 at 09:45
@Arty OK, so I didn't tell you that I'm using other domains (negative integers and 0) rather than just the usual positive integers specified in the classical Collatz conjecture; that's why I'm getting negative values. [Here](https://en.wikipedia.org/wiki/Collatz_conjecture#Iterating_on_all_integers) is the link. Also, thanks for the interesting speedups. – Vicrobot May 30 '21 at 12:43
I would like to know why the Python makers decided to specify a small size for the recursion stack. Is it because specifying the size of the stack was necessary in the C language? – Vicrobot May 30 '21 at 12:46
Also, what is this library doing here to improve the speed? – Vicrobot May 30 '21 at 13:18
@Vicrobot Recursion depth is limited by the stack size anyway. Even if you raise the recursion limit, you still have a small stack (a few megabytes), which results in a crash with a segmentation fault like the one you had. And I think the stack size is not that easy to modify. Probably, by keeping the default recursion depth fairly small, the designers of Python wanted to encourage programmers not to rely on deep recursion in their algorithms, but to use loops instead or use a list to model the stack. – Arty May 30 '21 at 14:06
@Vicrobot The GMPY2 library does two things: 1) It uses the Fast Fourier Transform for multiplying large enough numbers (above some size threshold), which improves speed greatly; regular Python uses Karatsuba multiplication for large integers instead. 2) It uses [SIMD](https://en.wikipedia.org/wiki/SIMD) instructions (like SSE/AVX/AVX512), and these SIMD instructions are probably not used by regular Python, hence the 2-4x speedup, which is a common amount for SIMD. – Arty May 30 '21 at 14:08
@Vicrobot Understood, so you have a non-classical Collatz with negative values; then yes, your algorithm is all right. In my optimized algorithm you can write `if n in (1, -1, -17):` instead of just `if n == 1:` to support negative values. In Python `n & 1` and `n >> 1` give the same results as `n % 2` and `n // 2` for negative integers too, so they can stay as they are. – Arty May 30 '21 at 14:11
@Arty optimizations without measured improvements are just code obfuscation ;-) – thebjorn May 31 '21 at 10:06
@Vicrobot Recursion in Python uses memory for each call. Some languages implement optimizations for calls in tail position, where the same stack frame is reused for the recursive call (i.e. it effectively turns it into a loop). Guido doesn't want tail-call optimizations in Python (http://neopythonic.blogspot.com/2009/04/tail-recursion-elimination.html), so it is unlikely to change. – thebjorn May 31 '21 at 10:22
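
For the all-integers variant discussed in these comments, here is a minimal sketch that uses itertools.count as the unbounded counter and stops at one member of each cycle listed in the linked Wikipedia section. The names CYCLE_MARKERS and collatz_end are hypothetical. Note that -5 is included: a start such as -7 falls into the cycle through -5 and never reaches -1 or -17, so a loop without that stop value would not terminate for it.

```python
from itertools import count

# One member of each cycle listed in the Wikipedia section linked above:
# 0, 1 (1, 4, 2), -1 (-1, -2), -5 and -17.
CYCLE_MARKERS = frozenset({0, 1, -1, -5, -17})

def collatz_end(n):
    """Return (marker, steps) for the first cycle marker reached from n."""
    for steps in count():  # unbounded counter instead of range(1 << 50)
        if n in CYCLE_MARKERS:
            return n, steps
        # In Python, & and >> on negative ints agree with % 2 and // 2,
        # so the bit operations are safe for every integer n.
        n = 3 * n + 1 if n & 1 else n >> 1

print(collatz_end(-7))  # -7 -> -20 -> -10 -> -5, i.e. (-5, 3)
print(collatz_end(6))   # (1, 8)
```

Termination for every integer is only conjectured, so the loop is unbounded by design; a cap on steps could be added if that matters.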

thebjorn · Answer 2 · 2021-05-29T19:02:39.293

I would write it as:

def collatz(n):
    tmp = -17 - 2**4096            # precompute constant value outside loop
    while 1:                       # in general, iteration is better than recursion in Python for these kinds of functions
        if n in (-1, 1, -17, tmp): # less typing than lots of or clauses
            return n               # return values rather than printing them
        elif n % 2 == 1:           # faster than converting to string and checking last digit
            n = 3*n + 1
        else:
            n //= 2

Call it with, e.g.:

print(collatz(2**4000 + 1))

Performance: on my PC, collatz(2**5000-1) with the above code takes 0.069 secs. Changing the code to elif str(n)[-1] in '13579': makes it take 1.606 secs, i.e. 23x slower.

On collatz(2**40000-1), the code above used 3.822 secs. Changing

elif n % 2 == 1:

to either

elif n % 2:

or

elif n & 1:

reduced the time to 1.988 secs (i.e. no difference between the two).

Changing n //= 2 to n >>= 1 reduced the time further to 1.506 secs.
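
To repeat this kind of comparison on your own machine, a minimal timeit sketch could look like the following. The helper names and the start value are hypothetical, and the timings will differ from the numbers quoted above.

```python
import timeit

def is_odd_str(n):
    return str(n)[-1] in '13579'  # decimal conversion on every call

def is_odd_mod(n):
    return n % 2 == 1

def is_odd_and(n):
    return bool(n & 1)

def run(is_odd, n):
    """Iterate the 3n+1 map from positive n down to 1 using the given parity test."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if is_odd(n) else n >> 1
        steps += 1
    return steps

start = 2**1000 + 1  # hypothetical test value, small enough to finish quickly
for check in (is_odd_str, is_odd_mod, is_odd_and):
    secs = timeit.timeit(lambda: run(check, start), number=1)
    print(f'{check.__name__}: {secs:.3f} secs')
```

One more reason to avoid the string check: CPython 3.11 and later refuse by default to convert integers with more than 4300 decimal digits to str, so it fails outright on inputs such as 2**40000-1 unless that limit is raised.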

edited May 29 '21 at 19:02

answered May 29 '21 at 18:44

thebjorn


BTW, I'm not sure whether Python does division optimizations, so it may be worth writing `n % 2` as `n & 1` and `n //= 2` as `n >>= 1`. – Arty May 29 '21 at 18:47
Python does not :-) – thebjorn May 29 '21 at 18:48
So, to make the code really optimal, it is better to write `n % 2` as `n & 1` and `n //= 2` as `n >>= 1`. Just a note, no need to modify your code. – Arty May 29 '21 at 18:49
@Arty I've added performance comparisons. The `str` removal had the most effect, but all the other suggestions had measurable performance benefits, at least on the single number I tested with ;-) – thebjorn May 29 '21 at 19:04
Another tiny optimization: `3n+1` is even => next step will be `//2`, you can combine those two steps into one `(3n+1)//2` i.e. `n+(n+1)//2`. Or simply omit the `else:` to do the division every time. – VPfB May 29 '21 at 19:49
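
The combined step from the last comment can be sketched like this for positive integers; the function names are hypothetical. The shortcut version counts two classical steps whenever it takes an odd step, so both functions should return the same chain length.

```python
def collatz_steps(n):
    """Plain step count for a positive integer n."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n & 1 else n >> 1
        steps += 1
    return steps

def collatz_steps_shortcut(n):
    """Same count, but each odd step also performs the halving that must follow."""
    steps = 0
    while n != 1:
        if n & 1:
            n += (n + 1) >> 1  # equals (3*n + 1) // 2 for odd n
            steps += 2         # two classical steps done at once
        else:
            n >>= 1
            steps += 1
    return steps

# Both versions agree on every start value checked here.
assert all(collatz_steps(n) == collatz_steps_shortcut(n) for n in range(1, 1000))
```

The shortcut cannot jump over 1, because (3*n + 1) // 2 is at least 5 for any odd n greater than 1.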
