
I am trying to implement code that retrieves the millionth Fibonacci number or beyond. I am using matrix exponentiation with NumPy for faster calculation.

According to my understanding it should take O(log N) time, and the worst case for a million should be nearly 6 seconds, which should be alright.

Following is my implementation:

import numpy as np

def fib(n):
    matrix = np.matrix([[1, 1], [1, 0]]) ** abs(n)
    # negafibonacci: F(-n) = (-1)**(n+1) * F(n)
    if n % 2 == 0 and n < 0:
        return -matrix[0, 1]
    return matrix[0, 1]

However, never mind a million: it is not even generating the correct result for n = 1000.

Expected result:

43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875

My result:

817770325994397771

Why is Python truncating the result? From the docs, it should be capable of calculating values even as large as 10**1000. Where did I go wrong?

  • If performance really matters that much, maybe you could compute the values directly with the closed-form expression https://en.wikipedia.org/wiki/Fibonacci_number#:~:text=Closed%2Dform%20expression%5Bedit,the%20same%20time%3A%5B23%5D – Peter Trencansky Dec 31 '21 at 13:59
  • Data of numpy arrays is limited by [machine limits for integer types](https://numpy.org/doc/stable/reference/generated/numpy.iinfo.html); see the sketch just below these comments. [Here you can find](https://stackoverflow.com/a/70542476/3044825) a related example of this issue. – mathfux Dec 31 '21 at 14:01
  • @mathfux Only until you tell it not to do that. – Kelly Bundy Dec 31 '21 at 14:54
  • Can you explain how this works? Or why `numpy` should help? Or did you get this approach from somewhere else? – hpaulj Dec 31 '21 at 16:45
  • @hpaulj NumPy's matrix exponentiation helps make the code clean/short... – Kelly Bundy Dec 31 '21 at 16:59
  • @KellyBundy, it may be short from the end user's point of view, but it hides a lot of implementation details. – hpaulj Dec 31 '21 at 20:33
  • @hpaulj You make it sound like that's a bad thing. – Kelly Bundy Dec 31 '21 at 20:40
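
A quick illustration of the machine limits mentioned in the comments (np.iinfo is from the linked NumPy docs; this assumes a platform where the default integer dtype is int64):

import numpy as np

# largest value a NumPy int64 can hold
print(np.iinfo(np.int64).max)  # 9223372036854775807

# the matrix entries silently wrap around once they pass that bound,
# so this prints a garbage value rather than F(100) = 354224848179261915075
print((np.matrix([[1, 1], [1, 0]]) ** 100)[0, 1])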

3 Answers

3

NumPy handles numbers and calculations in a high-performance way (in both memory efficiency and computing time) by using fixed-size machine integers. So while Python itself can process arbitrarily large numbers, NumPy can't. You can let Python do the calculation and get the correct result, trading away some performance.

Sample code:

import numpy as np

def fib(n):
    # dtype=object makes the matrix hold Python ints, so Python's
    # arbitrary-precision arithmetic does the calculation
    matrix = np.matrix([[1, 1], [1, 0]], dtype=object) ** abs(n)
    if n % 2 == 0 and n < 0:
        return -matrix[0, 1]
    return matrix[0, 1]

print(fib(1000))

Output:

43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875

PS: Warning

The millionth Fibonacci number is extremely large, so you should make sure that Python can handle it. If not, you will have to implement or find a module that handles such large numbers.
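
As it turns out (see the comments below), Python 3's int is arbitrary-precision, so size itself is not a problem for Python. A minimal, purely illustrative check:

# Python 3 ints grow as needed: a million-bit integer is no problem
big = 2 ** 1_000_000
print(big.bit_length())  # 1000001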

tandat
  • 255
  • 1
  • 9
  • @KellyBundy Because "the millionth Fibonacci number is extremely large", Python can handle larger numbers than NumPy, but it still has its own limit. – tandat Dec 31 '21 at 16:33
  • I have researched a little bit and realize that the limit has been removed from Python 3. I will edit my answer. – tandat Dec 31 '21 at 16:41
  • What *was* the limit before it was removed? – Kelly Bundy Dec 31 '21 at 16:42
  • 9223372036854775807 in Python 2. Got it from Google – tandat Dec 31 '21 at 16:44
  • Wasn't that just the limit for `int`, and when a calculation exceeded that, it simply automatically changed to `long` and still worked perfectly fine? – Kelly Bundy Dec 31 '21 at 16:45
  • I think you were right. Learned a new thing today. I left it as a warning because I was not sure about that; I thought it should have some limit like other languages. Thank you for clarifying my answer. – tandat Dec 31 '21 at 16:48
1

I'm not convinced that numpy is much help here, as it does not directly support Python's very large integers in vectorized operations. A basic Python implementation of an O(log N) algorithm gets the millionth Fibonacci number in 0.15 sec on my laptop. An iterative (slow) approach gets it in 12 seconds:

def slowFibo(N):
    # linear iteration: O(N) big-integer additions
    a, b = 0, 1
    for _ in range(N):
        a, b = b, a + b
    return a

# Nth Fibonacci number in O(log N) arithmetic steps, scanning the bits of N (N >= 0)
def fastFibo(N):
    a, b = 1, 1                          # (F(2^k), F(2^k + 1))
    f0, f1 = 0, 1                        # (F(2^k - 1), F(2^k)), starting at k = 0
    r, s = (1, 1) if N & 1 else (0, 1)   # running (F(m), F(m+1)) for the bits m consumed so far
    N //= 2
    while N > 0:
        # advance the doubling pair via F(m+n) = F(m)F(n+1) + F(m-1)F(n)
        a, b = f0*a + f1*b, f0*b + f1*(a + b)
        f0, f1 = b - a, a
        if N & 1:
            r, s = f0*r + f1*s, f0*s + f1*(r + s)
        N //= 2
    return r

Output:

f1K = slowFibo(1000) # 0.00009 sec 
f1K = fib(1000)      # 0.00011 sec (tandat's)
f1K = fastFibo(1000) # 0.00002 sec

43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875


f1M = slowFibo(1_000_000)  # 12.52 sec
f1M = fib(1_000_000)       # 0.2769 sec (tandat's)
f1M = fastFibo(1_000_000)  # 0.14718 sec

19532821287077577316...68996526838242546875

len(str(f1M))  # 208988 digits
Alain T.
  • 40,517
  • 4
  • 31
  • 51
1

The core of your function is

np.matrix([[1, 1], [1, 0]]) ** abs(n)

which is discussed in the Wikipedia article:

[Image: the matrix identity [[1, 1], [1, 0]]**n = [[F(n+1), F(n)], [F(n), F(n-1)]]]

np.matrix implements ** as __pow__, which in turn uses np.linalg.matrix_power. Essentially that's repeated dot matrix multiplication, with a modest enhancement: grouping the products by powers of 2 (sketched after the next example).

In [319]: M=np.matrix([[1, 1], [1, 0]])
In [320]: M**10
Out[320]: 
matrix([[89, 55],
        [55, 34]])
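
That grouping is essentially binary (square-and-multiply) exponentiation. As a sketch, with a hypothetical matpow standing in for what matrix_power does internally:

import numpy as np

def matpow(A, n):
    # square A once per bit of n, multiplying into the result
    # the squares that correspond to set bits
    result = np.eye(2, dtype=A.dtype)
    while n > 0:
        if n & 1:
            result = result @ A
        A = A @ A
        n >>= 1
    return result

print(matpow(np.array([[1, 1], [1, 0]]), 10))  # [[89 55] [55 34]]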

The use of np.matrix is discouraged, so I can do the same with

In [321]: A = np.array(M)
In [322]: A
Out[322]: 
array([[1, 1],
       [1, 0]])
In [323]: np.linalg.matrix_power(A,10)
Out[323]: 
array([[89, 55],
       [55, 34]])

Using the (newish) @ matrix multiplication operator, that's the same as:

In [324]: A@A@A@A@A@A@A@A@A@A
Out[324]: 
array([[89, 55],
       [55, 34]])

matrix_power does something more like:

In [325]: A2=A@A; A4=A2@A2; A8=A4@A4; A8@A2
Out[325]: 
array([[89, 55],
       [55, 34]])

And some comparative times:

In [326]: timeit np.linalg.matrix_power(A,10)
16.2 µs ± 58.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [327]: timeit M**10
33.5 µs ± 38.8 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [328]: timeit A@A@A@A@A@A@A@A@A@A
25.6 µs ± 914 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [329]: timeit A2=A@A; A4=A2@A2; A8=A4@A4; A8@A2
10.2 µs ± 97.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

numpy integers are implemented as int64, in C, so they are limited in size. Thus we get overflow with a modest n=100:

In [330]: np.linalg.matrix_power(A,100)
Out[330]: 
array([[ 1298777728820984005,  3736710778780434371],
       [ 3736710778780434371, -2437933049959450366]])

We can get around this by changing the dtype to object. The values are then Python ints, and can grow indefinitely:

In [331]: Ao = A.astype(object)
In [332]: Ao
Out[332]: 
array([[1, 1],
       [1, 0]], dtype=object)

Fortunately matrix_power can cleanly handle object dtype:

In [333]: np.linalg.matrix_power(Ao,100)
Out[333]: 
array([[573147844013817084101, 354224848179261915075],
       [354224848179261915075, 218922995834555169026]], dtype=object)

Usually math on object dtype is slower, but not in this case:

In [334]: timeit np.linalg.matrix_power(Ao,10)
14.9 µs ± 198 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

I'm guessing it's because of the small (2,2) size of the array, where fast compiled methods aren't useful. This is basically an iterative task, where numpy doesn't have any advantages.
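
To illustrate that, here is a hypothetical pure-Python version of the same square-and-multiply loop, on nested tuples of plain ints:

def matmul2(X, Y):
    # 2x2 matrix product with Python's arbitrary-precision ints
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def matpow2(M, n):
    result = ((1, 0), (0, 1))  # identity
    while n > 0:
        if n & 1:
            result = matmul2(result, M)
        M = matmul2(M, M)
        n >>= 1
    return result

print(matpow2(((1, 1), (1, 0)), 100)[0][1])  # F(100) = 354224848179261915075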

Scaling isn't bad: increase n by 10, and we only get a 3-4x increase in time. (The number of matrix products grows only like log n; most of the added cost is the arithmetic on the ever-larger Python integers.)

In [337]: np.linalg.matrix_power(Ao,1000)
Out[337]: 
array([[70330367711422815821835254877183549770181269836358732742604905087154537118196933579742249494562611733487750449241765991088186363265450223647106012053374121273867339111198139373125598767690091902245245323403501,
        43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875],
       [43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875,
        26863810024485359386146727202142923967616609318986952340123175997617981700247881689338369654483356564191827856161443356312976673642210350324634850410377680367334151172899169723197082763985615764450078474174626]],
      dtype=object)
In [338]: timeit np.linalg.matrix_power(Ao,1000)
53.8 µs ± 83 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

With object dtype np.matrix:

In [340]: Mo = M.astype(object)
In [344]: timeit Mo**1000
86.1 µs ± 164 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

And for a million, the times aren't as bad as I anticipated:

In [352]: timeit np.linalg.matrix_power(Ao,1_000_000)
423 ms ± 1.92 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

For comparison, the fastFibo times on my machine are:

In [354]: fastFibo(100)
Out[354]: 354224848179261915075
In [355]: timeit fastFibo(100)
3.91 µs ± 154 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [356]: timeit fastFibo(1000)
9.37 µs ± 23.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [357]: timeit fastFibo(1_000_000)
226 ms ± 12.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
hpaulj
  • 221,503
  • 14
  • 230
  • 353