I'm implementing Karatsuba multiplication in Scala (my choice) for an online course. Since the algorithm is meant to multiply large numbers, I chose the BigInt type, which is backed by Java's BigInteger. I'd like to implement the algorithm efficiently. The base-10 pseudocode, copied from Wikipedia, is:

procedure karatsuba(num1, num2)
  if (num1 < 10) or (num2 < 10)
    return num1*num2
  /* calculates the size of the numbers */
  m = max(size_base10(num1), size_base10(num2))
  m2 = floor(m/2)
  /* split the digit sequences in the middle */
  high1, low1 = split_at(num1, m2)
  high2, low2 = split_at(num2, m2)
  /* 3 calls made to numbers approximately half the size */
  z0 = karatsuba(low1, low2)
  z1 = karatsuba((low1 + high1), (low2 + high2))
  z2 = karatsuba(high1, high2)
  return (z2 * 10 ^ (m2 * 2)) + ((z1 - z2 - z0) * 10 ^ m2) + z0

Given that BigInteger is internally represented as an int[], if I can calculate m2 in terms of the int[], I can use bit shifting to extract the lower and higher halves of the number. Similarly, the last step can be achieved by bit shifting too.

However, it's easier said than done, as I can't seem to wrap my head around the logic. For example, if the max number is 999, the binary representation is 1111100111, lower half is 99 = 1100011, upper half is 9 = 1001. How do I get the above split?

Note: There is an existing question that shows how to implement using arithmetic on BigInteger, but not bit shifting. Hence, my question is not a duplicate.

Abhijit Sarkar
  • To use bit shifting, of course the base must be a power of two. The last answer of the linked question does use bit shifting. – harold Nov 04 '18 at 20:38
  • @harold "last answer" is vague, perhaps you're referring to [this](https://stackoverflow.com/a/40126296/839733) answer, which happens to be a copy-paste from somewhere else with no explanation. I've no confidence in such answers, the only thing it shows is that the author has an internet connection. – Abhijit Sarkar Nov 04 '18 at 20:41
  • That may be, but it definitely shows a way to split the big integer, which seems to be the focus of this question – harold Nov 04 '18 at 20:48
  • @harold Question is, does it do that correctly? Using the formula from that answer `N = (N / 2) + (N % 2)`, and my example of 999, we get `N = 6, low = 100111 = 39`. How is that equal to the desired 99? – Abhijit Sarkar Nov 04 '18 at 20:50
  • You cannot use bit shifts to split 999 into 9 and 99, the base must be a power of two. So basically, stop desiring 99, you can't efficiently split it that way, but you can split 1111100111 down the middle. – harold Nov 04 '18 at 20:55
  • @harold that makes sense, I was mixing up the 2 bases. One last thing: why cutoff at 2000? If you care to post your comments as an answer to my question, I'll be happy to accept it. – Abhijit Sarkar Nov 04 '18 at 21:06

1 Answer

To be able to use bit shifting to do the splits and recombination, the base needs to be a power of two. Using two itself, as in the linked answer, is probably reasonable. Then the "length" of the inputs can be found directly with bitLength, and the split could be implemented as:

// x = a + 2^N b
BigInteger b = x.shiftRight(N);
BigInteger a = x.subtract(b.shiftLeft(N));

where N is the number of bits that a will have.
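As a minimal sketch of that split, assuming a non-negative input: the class name `BitSplit` and the method `split` are hypothetical names chosen for illustration, and only `shiftRight`, `shiftLeft` and `subtract` from `java.math.BigInteger` are used. It splits the question's example 999 (binary 1111100111) down the middle with N = 5.

```java
import java.math.BigInteger;

// Hypothetical helper, for illustration only.
final class BitSplit {
    /**
     * Splits a non-negative x into {a, b} such that x = a + 2^n * b,
     * where a holds the low n bits and b holds the remaining high bits.
     */
    static BigInteger[] split(BigInteger x, int n) {
        BigInteger b = x.shiftRight(n);               // high part
        BigInteger a = x.subtract(b.shiftLeft(n));    // low part (n bits)
        return new BigInteger[] { a, b };
    }

    public static void main(String[] args) {
        BigInteger x = BigInteger.valueOf(999);       // binary 1111100111
        BigInteger[] parts = split(x, 5);
        // low = 00111 = 7, high = 11111 = 31, and 7 + 32 * 31 = 999
        System.out.println(parts[0] + " " + parts[1]); // prints "7 31"
    }
}
```

The halves are 31 and 7 rather than 9 and 99, because the split is in base 2 and not in base 10. Karatsuba only needs the identity x = a + 2^N b to hold, so that is fine.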

Given that BigInteger is implemented with 32-bit limbs, it makes sense to use 2³² as the base. That way the big shifts only move whole ints and avoid the slower code path where the BigInteger is shifted by a value between 1 and 31. This can be accomplished by rounding N to a multiple of 32.
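One way the rounding could look, as a sketch: `LimbAlignedSplit` and `splitPoint` are hypothetical names, and rounding up (rather than to the nearest multiple) is an assumption made here. Rounding up means that inputs of 32 bits or fewer end up with an empty high half, so this is only usable together with a cutoff that hands small inputs to the built-in multiply.

```java
import java.math.BigInteger;

// Hypothetical helper, for illustration only.
final class LimbAlignedSplit {
    /**
     * Picks the split position N in bits: about half the bit length of
     * the larger operand, rounded up to a multiple of 32 so that the
     * shifts by N move whole 32-bit limbs.
     */
    static int splitPoint(BigInteger x, BigInteger y) {
        int bits = Math.max(x.bitLength(), y.bitLength());
        int half = (bits + 1) / 2;
        return (half + 31) & ~31;   // round up to a multiple of 32
    }

    public static void main(String[] args) {
        BigInteger x = BigInteger.ONE.shiftLeft(999);   // bit length 1000
        BigInteger y = BigInteger.ONE;
        // half = 500, rounded up to the next multiple of 32
        System.out.println(splitPoint(x, y));           // prints "512"
    }
}
```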

The specific constant in this line,

if (N <= 2000) return x.multiply(y);                // optimize this parameter

should probably not be trusted too much, given that comment. For performance there should be some bound, though; otherwise the recursive splitting goes too deep. For example, when the numbers are 32 bits or less, it's clearly better to just multiply, but a good cutoff is probably much higher. In the source of BigInteger itself, the cutoff is expressed in terms of the number of limbs instead of bits, and set to 80 (so 2560 bits). It also has another threshold above which it switches to 3-way Toom-Cook multiplication instead of Karatsuba multiplication.
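Putting the pieces together, a sketch of the whole recursion could look like the following. `KaratsubaSketch`, `multiply` and `CUTOFF_BITS` are hypothetical names. The cutoff simply reuses the 80-limb figure mentioned above and has not been tuned. Since recent versions of `BigInteger.multiply` already apply Karatsuba internally for large operands, this is for study rather than for speed.

```java
import java.math.BigInteger;

// Illustrative sketch, not a tuned implementation.
final class KaratsubaSketch {
    // Assumption: 80 limbs of 32 bits, borrowed from the figure above.
    static final int CUTOFF_BITS = 80 * 32;

    static BigInteger multiply(BigInteger x, BigInteger y) {
        // The bit-level split assumes non-negative values, so multiply
        // magnitudes and restore the sign afterwards.
        if (x.signum() < 0 || y.signum() < 0) {
            BigInteger p = multiply(x.abs(), y.abs());
            return x.signum() * y.signum() < 0 ? p.negate() : p;
        }

        int bits = Math.max(x.bitLength(), y.bitLength());
        if (bits <= CUTOFF_BITS) {
            return x.multiply(y);   // small enough: stop recursing
        }

        // Split position: half the bit length, rounded up to a multiple of 32.
        int n = (((bits + 1) / 2) + 31) & ~31;

        // x = xLow + 2^n * xHigh, y = yLow + 2^n * yHigh
        BigInteger xHigh = x.shiftRight(n);
        BigInteger xLow = x.subtract(xHigh.shiftLeft(n));
        BigInteger yHigh = y.shiftRight(n);
        BigInteger yLow = y.subtract(yHigh.shiftLeft(n));

        // Three recursive products instead of four.
        BigInteger z0 = multiply(xLow, yLow);
        BigInteger z2 = multiply(xHigh, yHigh);
        BigInteger z1 = multiply(xLow.add(xHigh), yLow.add(yHigh));

        // z2 * 2^(2n) + (z1 - z2 - z0) * 2^n + z0
        return z2.shiftLeft(2 * n)
                 .add(z1.subtract(z2).subtract(z0).shiftLeft(n))
                 .add(z0);
    }
}
```

A quick sanity check is to compare `KaratsubaSketch.multiply(x, y)` against `x.multiply(y)` for random operands well above the cutoff, for example ones created with `new BigInteger(10000, new java.util.Random())`.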

harold
  • I added some different things that weren't in the comments yet so hopefully this is more useful – harold Nov 04 '18 at 21:25
  • Looks like the optimization was introduced in Java 8; the [Java 7 source code](https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/math/BigInteger.java#L1165) doesn't do any of that. I'm finding the JDK implementation interesting. – Abhijit Sarkar Nov 04 '18 at 21:57