
In coding up a simple Fibonacci script, I found some 'odd' behaviour in how Python treats numpy.int32 vs how it treats regular int numbers.

Can anyone help me understand what causes this behaviour?

I'm using the following Fibonacci code, with caching to speed things up significantly:

from functools import lru_cache
import numpy as np

@lru_cache(maxsize=None)
def fibo(n):
    if n <= 1:
        return n
    else:
        return fibo(n-1)+fibo(n-2)

If I define a NumPy array of numbers to calculate over (with np.arange), it all works well until n = 47, then things start going haywire. If, on the other hand, I use a regular Python list, the values are all calculated correctly.

You should be able to see the difference with the following:

fibo(np.int32(47)), fibo(47)

Which should return (at least it does for me):

(-1323752223, 2971215073)

Obviously, something very wrong has occurred with the calculations against the numpy.int32 input. Now, I can get around the issue by simply inserting an 'n = int(n)' line in the fibo function before anything else is evaluated, but I don't understand why this is necessary.
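For reference, the workaround described above might look roughly like this. It is only a sketch: the names `fibo_safe` and `_fibo_int` are made up for illustration, and the conversion is placed in a thin wrapper (rather than inside the cached function) so that the cached function only ever sees built-in ints.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def _fibo_int(n):
    # n is always a built-in int here, so the additions cannot overflow.
    if n <= 1:
        return n
    return _fibo_int(n - 1) + _fibo_int(n - 2)


def fibo_safe(n):
    # Hypothetical wrapper: convert the argument to a built-in int once,
    # before any arithmetic is done on it.
    return _fibo_int(int(n))


result = fibo_safe(47)
print(result)  # 2971215073
```

Because `int()` also accepts NumPy integer scalars, a call such as `fibo_safe(np.int32(47))` would go through the same built-in int arithmetic.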

I've also tried np.int(47) instead of np.int32(47), and found that the former works just fine. However, using np.arange to create the array seems to default to np.int32 data type.

I've tried removing the caching (I wouldn't recommend you try - it takes around 2 hours to calculate to n = 47) - and I get the same behaviour, so that is not the cause.

Can anyone shed some insight into this for me?

Thanks

Mark Burgoyne
  • Python `int` doesn't have an upper bound. `int32` overflows relatively soon, `int64` later. Why use numpy at all? You could just as well start with `list(range(...))`. I don't see where you use `arange`. – hpaulj Nov 12 '21 at 01:36
  • On my setup `fibo(np.int32(47))` produced the result of your `fibo(47)`. What versions of Python/numpy are you using? – 0x263A Nov 12 '21 at 01:37
  • I'm using Python 3.7.7 with NumPy 1.19.5. I did also try with numpy.int64, and had the same issue. I suspect that hpaulj has highlighted the root cause of this. As to why I'm using NumPy, I'm not even using Fibonacci - just playing around and learning. How can we create an array of ints from np.arange without it reverting to numpy.int32, I wonder? – Mark Burgoyne Nov 12 '21 at 01:42
  • I think @0x263a missed the point. Calling `fibo(np.int32(47))` will work just fine, because Python will convert the numpy integer to a normal integer as soon as you perform normal arithmetic on it. However if you write code that takes two numpy ints, say, from an array, adds them and stores them back into a numpy array, the truncation will happen. – Frank Yellin Nov 12 '21 at 02:25

1 Answer


Python's "integers have unlimited precision". This was built into the language so that new users have "one less thing to learn".

Though maybe not in your case, or for anyone using NumPy. That library is designed to make computations as fast as possible. It therefore uses data types that are well supported by the CPU architecture, such as 32-bit and 64-bit integers that neatly fit into a CPU register and have an invariable memory footprint.

But then we're back to dealing with overflow problems like in any other programming language. NumPy does warn about that though:

>>> print(fibo(np.int32(47)))
fib.py:9: RuntimeWarning: overflow encountered in long_scalars
  return fibo(n-1)+fibo(n-2)
-1323752223

Here we are using a signed 32-bit integer. The largest positive number it can hold is 2^31 - 1 = 2147483647. But the 47th Fibonacci number is even larger than that: it's 2971215073, as you calculated. In that case, the 32-bit integer overflows and wraps around to -1323752223, which is the same 32 bits read as a two's-complement signed number:

>>> 2971215073 + 1323752223 == 2**32
True
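The wraparound can also be reproduced with plain Python integers, without NumPy. The following is a minimal sketch; `wrap_int32` is a hypothetical helper written for this illustration, not something NumPy provides.

```python
def wrap_int32(value):
    # Hypothetical helper: return the value a signed 32-bit
    # two's-complement integer would hold after wrapping around.
    value &= 0xFFFFFFFF      # keep only the low 32 bits
    if value >= 2**31:       # top bit set: the number reads as negative
        value -= 2**32
    return value


exact = 2971215073           # 47th Fibonacci number as a Python int
wrapped = wrap_int32(exact)
print(wrapped)  # -1323752223
```

Values that fit in 32 bits pass through unchanged, which is why the results only start to differ at n = 47.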

It worked with np.int because that's just an alias of the built-in int, so it returns a Python integer:

>>> np.int is int
True

For more on this, see: What is the difference between native int type and the numpy.int types?

Also note that np.arange for integer arguments returns an integer array of type np.int_ (with a trailing underscore, unlike np.int). That data type is platform-dependent and maps to 32-bit integers on Windows, but 64-bit on Linux.
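As for getting an array from np.arange that does not fall back to 32-bit integers, one possible approach is to check the default and then either request a wider dtype or convert to Python integers. This is a sketch of those options, not the only way to do it; the variable names are illustrative.

```python
import numpy as np

# The default integer dtype is platform-dependent, so inspect it
# rather than assume a width.
default = np.arange(50)
print(default.dtype)

# Requesting 64-bit integers explicitly only postpones the overflow
# to larger Fibonacci numbers.
wide = np.arange(50, dtype=np.int64)

# tolist() converts the elements to built-in Python ints,
# which have unlimited precision.
as_python_ints = np.arange(50).tolist()
print(type(as_python_ints[47]))
```

Iterating over `as_python_ints` and calling fibo on each element then uses Python integer arithmetic throughout.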

john-hen