Karatsuba algorithm working for small numbers but not for big ones, can't see why

Question

I am relatively new to programming and am not looking to be particularly efficient with this algorithm regarding running time but only trying to replicate the Karatsuba algorithm and make it work.

I have tried it with many numbers and small numbers (like y = 40004009343254, x = 40004001343234) work fine and when the numbers increase in size (like y = 4000400934325423423, x = 4000400134323432423), the algorithm stops working correctly and returns similar but incorrect answers.

Any clue about what could be wrong would be very much appreciated!

NOTE: This thread is not about efficiency but about getting the correct result. That said, comments about efficiency will be taken into account and appreciated too.

CODE:

y = 4000400934325423423
x = 4000400134323432423

def firsthalf(array):
    firsthalf = array[:len(array)/2]
    return firsthalf
def secondhalf(array):
    secondhalf = array[len(array)/2:]
    return secondhalf
def arrayjoint(array):
    jointarray = long(''.join(map(str,array)))
    return jointarray
def karatsuba(x,y):
    if len(str(x)) == 0 or len(str(y)) == 0:
        return "Can't multiply by a NULL value!"
    if x < 10 or y < 10:
        return x * y
    x_array = [long(i) for i in str(x)]
    y_array = [long(i) for i in str(y)]
    firsthalf_xarray = firsthalf(x_array)
    secondhalf_xarray = secondhalf(x_array)
    firsthalf_yarray = firsthalf(y_array)
    secondhalf_yarray = secondhalf(y_array)
    half_size = max(len(secondhalf_yarray), len(secondhalf_xarray))
    firsthalf_x = arrayjoint(firsthalf_xarray)
    secondhalf_x = arrayjoint(secondhalf_xarray)
    firsthalf_y = arrayjoint(firsthalf_yarray)
    secondhalf_y = arrayjoint(secondhalf_yarray)
    sum_x = firsthalf_x + secondhalf_x
    sum_y = firsthalf_y + secondhalf_y
    first = karatsuba(firsthalf_x,firsthalf_y)
    second = karatsuba(sum_x, sum_y)
    third = karatsuba(secondhalf_x,secondhalf_y)
    return first * 10 ** (2 * half_size) + ((second - first - third) * (10 ** half_size)) + third

result = karatsuba(x,y)
result_correct = x*y
result = str(result)
result_correct = str(result_correct)
file = open("result.txt", "w")
file.write(str(result)  + "\n" + str(result_correct))
file.close

*Similar but incorrect answers* suggests that you are surpassing the maximum value of long ints and thus step into the world of [floating point numbers](https://en.wikipedia.org/wiki/Floating_point). Which are known to be accurate only up to a fixed number of decimals. (I'd also object to your calling `40004009343254` a "small number". If it were meters, that number would still be equal to more than 100,000 times the distance to the moon.) — Jongware, Dec 09 '16 at 15:27
@RadLexus According to [python docs](https://docs.python.org/3.5/library/stdtypes.html#numeric-types-int-float-complex), python integers have unlimited precision. (same with 2.7 long integers) — Christian W., Dec 09 '16 at 15:36

David Eisenstat · Accepted Answer · 2016-12-09T15:56:48.553

It's not an issue with floats, because Python has bignums.

The issue is that, when the inputs have disparate lengths, you split them in different places, which defeats the algebra underlying Karatsuba's algorithm. By splitting at index -half_size (i.e., the second half has half_size digits), we ensure that 10**half_size is the proper base. Try this:

def digits_to_long(x_array):
    return long(''.join(x_array)) if x_array else 0L


def karatsuba(x, y):
    if x < 10 or y < 10:
        return x * y
    x_array = str(x)
    y_array = str(y)
    half_size = max(len(x_array), len(y_array)) // 2
    firsthalf_x = digits_to_long(x_array[:-half_size])
    secondhalf_x = digits_to_long(x_array[-half_size:])
    firsthalf_y = digits_to_long(y_array[:-half_size])
    secondhalf_y = digits_to_long(y_array[-half_size:])
    sum_x = firsthalf_x + secondhalf_x
    sum_y = firsthalf_y + secondhalf_y
    first = karatsuba(firsthalf_x, firsthalf_y)
    second = karatsuba(sum_x, sum_y)
    third = karatsuba(secondhalf_x, secondhalf_y)
    return first * 10**(2 * half_size) + (
        (second - first - third) * (10**half_size)) + third


import random
for i in range(10000):
    x = random.randrange(10**18)
    y = random.randrange(10**18)
    assert karatsuba(x, y) == x * y

Ah yes, I see this note on `pow`: "Unlike the built-in `**` operator, `math.pow()` converts both its arguments to type float." — Jongware, Dec 09 '16 at 15:53

Karatsuba algorithm working for small numbers but not for big ones, can't see why

1 Answers1

Linked