
I'm very perplexed by this one. I have three implementations of an algorithm that calculates the factorial of a number. I measured the average runtime of each for inputs up to 2500 and plotted the results. By visual inspection, the runtimes appear to grow quadratically rather than linearly. To explore this further, I used curve fitting, which confirmed the visual impression.

Why is this happening? Is it maybe related to the way multiplication is handled in Python for small numbers? (See: Complexity of recursive factorial program)

import matplotlib.pyplot as plt
import numpy as np
import random
from scipy.optimize import curve_fit
import statistics as st
import timeit
import sys

sys.setrecursionlimit(3000)

def factorial_iterative(n):
    ''' Function that calculates the factorial of its argument using iteration
    
    Assumes that its argument is a positive integer
    Uses iteration
    Returns the factorial of the argument
    '''
    factorial = n
    while n > 1:
        n -= 1
        factorial *= n
    return factorial

def factorial_recursive_non_tail(n):
    ''' Function that calculates the factorial of its argument using non-tail recursion
    
    Assumes that its argument is a positive integer
    Uses non-tail recursion
    Returns the factorial of the argument
    '''
    if n == 1:
        return 1
    else:
        return n * factorial_recursive_non_tail(n - 1) 

def factorial_recursive_tail(n, factorial):
    ''' Function that calculates the factorial of its argument using tail recursion
    
    Assumes that its first argument is a positive integer
    Assumes that its second argument is an accumulator with a value of 1
    Uses tail recursion
    Returns the factorial of the argument
    '''
    if n == 1:
        return factorial
    else:
        return factorial_recursive_tail(n-1, n*factorial)

# max input size
n = 2500

# create input values list and initialise lists to store runtimes
n_values = list(range(1, n+1))
fact_iterative_runtimes_list = []
fact_non_tail_runtimes_list = []
fact_tail_runtimes_list = []

# for each n, time the implementation 100 times, calculate avg runtime, and store in dedicated list
for i in n_values:
    # iterative
    fact_iter_runtime = timeit.repeat(lambda: factorial_iterative(i), number=1, repeat=100)
    fact_iterative_runtimes_list.append(st.mean(fact_iter_runtime))

    # non-tail recursive
    fact_recursive_non_tail_runtime = timeit.repeat(lambda: factorial_recursive_non_tail(i), number=1, repeat=100)
    fact_non_tail_runtimes_list.append(st.mean(fact_recursive_non_tail_runtime))

    # tail recursive
    fact_recursive_tail_runtime = timeit.repeat(lambda: factorial_recursive_tail(i, 1), number=1, repeat=100)
    fact_tail_runtimes_list.append(st.mean(fact_recursive_tail_runtime))

# Plot avg runtimes against input sizes
plt.figure(figsize=(18, 6))
plt.plot(n_values, fact_iterative_runtimes_list, label = "Iterative")
plt.plot(n_values, fact_non_tail_runtimes_list, label = "Recursive non-tail")
plt.plot(n_values, fact_tail_runtimes_list, label = "Recursive tail")
plt.ylabel("Running time (seconds)")
plt.xlabel("Values of n")
plt.legend()
plt.show()
Konrad Rudolph
beyond_sci
  • I'm not sure the matplotlib tag applies here since the question just uses it to plot but the question is not actually related to matplotlib – aaossa Mar 15 '22 at 12:38
  • If anything it’d be related to the way multiplication in Python is handled for *large* numbers, not small ones. Since Python has arbitrary precision integers, the cost of multiplication is no longer constant. – Konrad Rudolph Mar 15 '22 at 12:38
  • @luk2302 I'm aware, that's the expectation. The problem is what's expected is not actually the reality. – beyond_sci Mar 15 '22 at 12:41
  • @KonradRudolph so would you say the way multiplication is handled could be one of the reasons? – beyond_sci Mar 15 '22 at 12:42
  • Instead of plotting runtime, which can be affected by so many things (available machine resources like load factor and CPU, how multiplication is handled in Python, etc.), why don't you use a static counter to check the number of iterations for multiple values of `n` and see how that grows? – vish4071 Mar 15 '22 at 13:59
  • There's no way to calculate exact integer "factorials up to 2500" on today's computers that is going to be linear (we'd need CPUs with instruction and data paths millions of bits wide). You can go up to about 20! on most modern CPUs before you have to revert to basic math operations that are themselves at least O(b). – RBarryYoung Mar 15 '22 at 15:07
  • @vish4071: this is of no use. The number of multiplies is already known to be linear. In fact, exactly n-1. –  Mar 15 '22 at 15:54
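
The two points raised in the comments above can be checked directly. The sketch below is a hypothetical instrumented copy of `factorial_iterative` (the name `count_multiplications` is not from the question) that counts the multiplications, then uses `int.bit_length` to show how far the result outgrows a 64-bit machine word.

```python
import math

def count_multiplications(n):
    """Instrumented copy of the question's iterative factorial.

    Returns (factorial, number_of_multiplications) for a positive integer n.
    """
    factorial = n
    count = 0
    while n > 1:
        n -= 1
        factorial *= n
        count += 1
    return factorial, count

value, mults = count_multiplications(2500)
print(mults)               # 2499, i.e. n - 1 multiplications
print(value.bit_length())  # size of 2500! in bits, far beyond one machine word

# 20! still fits in a signed 64-bit word; 21! does not fit even unsigned
print(math.factorial(20) < 2**63, math.factorial(21) > 2**64)
```

The count grows linearly, so the superlinear runtime has to come from the cost of each multiplication rather than from the number of multiplications.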

1 Answer


As @Konrad has pointed out, it is due to the way multiplication is handled in Python.

For smaller numbers, simple schoolbook multiplication (which runs in O(N^2)) is used. However, for bigger numbers, Python uses the Karatsuba algorithm, which has an estimated complexity of O(N^1.58). Here N is the length of the number, not its value. Since a multiplication doesn't take O(1) time, your overall time complexity isn't linear.

There are "faster" multiplication algorithms (such as Toom-Cook and Schönhage-Strassen) if you want to look into them.

Abhinav Mathur
  • There is some confusion in this answer. The question calls `n` the input number. The "school level multiplication" and the Karatsuba algorithm have complexities O(k^2) and O(k^1.58) respectively, where k is **the length of the input**, not the input number. Thus k = log_10(n), and k^2 is **much** smaller than n^2. – Stef Mar 15 '22 at 15:30
  • @Stef good point, added that – Abhinav Mathur Mar 15 '22 at 15:49
  • Also, you can trivially multiply a bignum by a smallnum in time linear in the length of the bignum, and Python does that. No sophisticated algorithm is needed, not even Karatsuba, and no sophisticated algorithm is going to help. (That assumes the product of two smallnums takes constant time, which it does in practice.) – rici Mar 16 '22 at 02:14
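
Following the last comment, a rough cost model can illustrate where the curve comes from. This is an assumption for illustration, not a measurement: each step of the loop is charged an amount proportional to the bit length of the running product, since multiplying a large integer by a small one is linear in the length of the large one. The helper name `total_operand_bits` is made up for this sketch.

```python
def total_operand_bits(n):
    """Sum the bit lengths of the running product before each multiplication in n!."""
    product = 1
    total = 0
    for k in range(2, n + 1):
        total += product.bit_length()  # modelled cost of computing product * k
        product *= k
    return total

for n in (625, 1250, 2500):
    print(n, total_operand_bits(n))

# Doubling n multiplies the modelled cost by somewhat more than 4
print(total_operand_bits(2000) / total_operand_bits(1000))
```

Under this model the total work grows roughly like n^2 log n, which would be consistent with the near-quadratic curves described in the question.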