Best way to store large base B numbers?

Question

What is the best way to store large base B numbers so that operations like right shift and checking the least significant bit can be done efficiently?

Actually, I came across an interview question which says:

Given two numbers N and K such that 0<=N<=1000 and 0<=K<=1000^1000, I need to check whether N^N = K or not. For example:

if N=2 and K=4 , 2^2 = 4 so return true;
if N=3 and K=26 , 3^3 != 26 so return false

What I was thinking is that if I consider the base N number system, then N^N will be equivalent to 1 followed by N zeros. For example, for N = 2, 2^2 = 100 (in base 2), and for N = 3, 3^3 = 1000 (in base 3). I can then easily write a function to tell whether K = N^N or not.

int check(unsigned long int N, unsigned long int K) 
{
    unsigned long int i;
    if (N == 0) // avoid dividing by zero below; 0^0 is taken as 1 here
        return (K == 1);
    for (i=0; i<N; i++) 
    {
        if (K%N != 0) // checking whether the LSB is 0 in the base N number system 
           return 0;
        K = K/N; // right shift K by one base N digit 
    }
    return (K==1);
}

Now there are two major problems with this function:

1) An unsigned long int is not sufficient to store large numbers in the range 0 to 1000^1000.
2) The modulus and division operations make it even less efficient.

In order to make it efficient, I am looking for some way to represent large base N numbers so that I can perform the right shift and least-significant-bit check efficiently. Has anyone come across such a thing before? Or does anyone know any other way to solve this problem efficiently?

In what form are the numbers originally given? If you want to represent `K` in base `N`, generally, you have to do the division, so that won't buy you anything over dividing and checking. — Daniel Fischer, Jun 13 '12 at 18:01
What Daniel says: even if you have such a data structure, converting `K` to a base `N` representation in order to plug it into the data structure is pretty much the whole problem you face. It's equivalent to all those divisions and moduluses. — Steve Jessop, Jun 13 '12 at 18:07
@DanielFischer: Actually, I read this question on the internet (http://www.careercup.com/question?id=13880754), where no info is given about the original numbers except the range. So I guess you can assume whatever form is convenient and helps you solve the problem efficiently. — Ravi Gupta, Jun 13 '12 at 18:08
@Ravi: I think realistically you have to assume that inputs in such problems are in base 2 or base 10. It's "cheating" to require that `K` be input in base `N`, because it makes the problem too easy (or rather, it shifts the problem to the caller of your function, such that your function is not as useful as the one the interviewer was hoping you'd write). — Steve Jessop, Jun 13 '12 at 18:11

score 2 · Answer 1 · answered Jun 13 '12 at 18:16

To check equality you don't actually have to do high-precision arithmetic: you could use the Chinese remainder theorem (http://en.wikipedia.org/wiki/Chinese_remainder_theorem). Find enough primes to ensure that their product is greater than both N^N and K (a product greater than 1000^1000 covers every allowed input), and then check N^N against K modulo each of the primes in turn.
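A minimal sketch of that modular check, assuming K arrives as a decimal string and that 0^0 counts as 1. The helper names (`primes_from`, `mod_of_decimal`, `equals_n_to_n`) are made up for illustration. It uses 1000 primes starting at 1009, so their product exceeds 1000^1000:

```python
def primes_from(start, count):
    """Return the first `count` primes >= start, found by trial division."""
    found = []
    candidate = start
    while len(found) < count:
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            found.append(candidate)
        candidate += 1
    return found


# Each prime is > 1000, so the product of all 1000 of them is > 1000^1000.
PRIMES = primes_from(1009, 1000)


def mod_of_decimal(k_str, p):
    """Reduce the decimal string k_str modulo p, one digit at a time."""
    r = 0
    for ch in k_str:
        r = (r * 10 + int(ch)) % p
    return r


def equals_n_to_n(n, k_str):
    """True if K == N^N, assuming 0 <= N <= 1000 and 0 <= K <= 1000^1000."""
    # pow(n, n, p) is modular exponentiation; pow(0, 0, p) is 1 (assumed convention).
    return all(pow(n, n, p) == mod_of_decimal(k_str, p) for p in PRIMES)
```

Only small integers are ever handled here: every intermediate value stays below the square of the largest prime.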

In practice, I'd probably use the Java BigInteger package to make the naive calculation.

mcdowella


`s/Java BigInteger/gmp/`, since the question says C. – Steve Jessop Jun 13 '12 at 18:23

Steve Jessop · Answer 2 · 2012-06-13T18:48:42.947

Depending on the interviewer, there are a few answers that might be accepted. And if any of these isn't accepted, then hopefully the interviewer will swiftly move you on to suggest something else.

Use gmp, and either do your divisions and moduluses with mpz_t instead of unsigned long or else just calculate N^N and compare it with K. This is the simplest thing that works.
Write your own big number library, perhaps using base 10 as the internal representation rather than base 2, if the inputs are in base 10 and it's thought that sticking with that might be faster than converting to binary first. Of course it needn't be a complete arithmetic library, all you really need is division-with-remainder.
Invent some fast tests that you can do before starting, to avoid a lot of work when K is nowhere near N^N in some respect. For example, test whether log(K) / N is approximately equal to log(N), with the log taken in whatever base is most convenient for the input. Or test how many times K and N are divisible by convenient numbers like 2 or 10: if K isn't divisible exactly N times as often as N is, then it's obviously wrong. Or test whether they're equal modulo some small number like ULONG_MAX or 1000000. Unfortunately, this kind of thing only speeds up certain cases where they're not equal; it slows down everything else, including cases where they are. So it may be counter-productive, depending on what inputs you're expecting.
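Two of those quick tests might look roughly like this, assuming K is a decimal string without leading zeros and N >= 1. The function names are hypothetical. The first test counts trailing decimal zeros (N^N has exactly N times as many as N does); the second compares the values modulo 1000000, which for a decimal string is just its last six digits:

```python
def trailing_zeros(s):
    """Number of trailing '0' characters in the decimal string s."""
    return len(s) - len(s.rstrip("0"))


def quick_reject(n, k_str):
    """Return True if K is certainly not N^N; False means 'still possible'.

    Assumes n >= 1 and k_str is a decimal string without leading zeros.
    """
    if k_str == "0":
        return True  # N^N is never 0 for N >= 1
    # Divisibility by 10: K must end in exactly n times as many zeros as N.
    if trailing_zeros(k_str) != n * trailing_zeros(str(n)):
        return True
    # Equality modulo 1000000: compare the last six digits of K with N^N mod 10^6.
    if int(k_str[-6:]) != pow(n, n, 10 ** 6):
        return True
    return False
```

A `False` result is not a proof of equality: `quick_reject(3, "1000027")` returns `False` even though 3^3 is 27, so an exact check still has to follow.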

mcdowella's answer may or may not be best, I don't know. It's especially promising that you only need to generate the primes once (when you write the program), and 1000 primes starting from 1009 is more than sufficient given N <= 1000. Bigger primes mean fewer needed and hence less work, especially if they don't get bigger than the square root of ULONG_MAX. Use exponentiation by squaring or equivalent to get N^N modulo each prime, and for K do a few digits at a time in whatever base the input is.
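The two building blocks mentioned here, exponentiation by squaring and reducing K a few digits at a time, could be sketched as follows. This assumes decimal input; `pow_mod` and `chunked_mod` are illustrative names, and the chunk size of 9 digits is an arbitrary choice that would keep each piece inside a 32-bit word in C:

```python
def pow_mod(base, exp, p):
    """Compute base**exp mod p by repeated squaring."""
    result = 1 % p
    base %= p
    while exp > 0:
        if exp & 1:                  # current low bit of the exponent is set
            result = result * base % p
        base = base * base % p       # square for the next bit
        exp >>= 1
    return result


def chunked_mod(k_str, p, chunk=9):
    """Reduce the decimal string k_str modulo p, `chunk` digits at a time."""
    r = 0
    for i in range(0, len(k_str), chunk):
        piece = k_str[i:i + chunk]
        # Shift the running remainder left by len(piece) decimal digits, then add the piece.
        r = (r * 10 ** len(piece) + int(piece)) % p
    return r
```

For one prime p, the comparison is then `pow_mod(n, n, p) == chunked_mod(k_str, p)`.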

To be really flash, for each of your pre-chosen primes p you can write (or let the compiler write for you) a modulo-p operation that's faster than a general integer modulus operation that works for any divisor. That is to say, with a decent compiler i % 1009 might be faster than i % j where the value of j is unknown at compile time, but turns out at runtime to be 1009. But beware, the difference in speed may not justify the cost of (say) calling it through a function pointer. So taking advantage of it might require some ugly-looking repetitive code.

sukunrt · Answer 3 · 2012-06-13T18:30:48.667

Why do you want to convert the numbers to base N?

You can keep on dividing by N. If every division is exact and you get 1 after N such divisions, then K is N^N; otherwise it isn't.

You will have to store K as a string and implement a division operation.

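def divide(k, n):
    # Long division of the decimal string k by n; returns the quotient as a string.
    c = ''  # running remainder, kept as a string
    a = ''  # quotient digits collected so far
    for i in k:
        c = c + i
        a = a + str(int(c) // int(n))  # integer division
        c = str(int(c) % int(n))
    return a.lstrip('0') or '0'  # drop leading zeros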

This is in Python; you can implement something similar in C.
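Putting the pieces together, a complete check along these lines might look like the sketch below. It is a variant of the division above that also returns the remainder, so inexact divisions can be detected. The names `divmod_decimal` and `is_n_to_n` are made up, and treating 0^0 as 1 is an assumption:

```python
def divmod_decimal(k_str, n):
    """Divide the decimal string k_str by the small integer n.

    Returns (quotient as a decimal string, remainder as an int).
    """
    quotient = []
    rem = 0
    for ch in k_str:
        rem = rem * 10 + int(ch)
        quotient.append(str(rem // n))
        rem %= n
    return "".join(quotient).lstrip("0") or "0", rem


def is_n_to_n(n, k_str):
    """True if the decimal string k_str represents n**n."""
    if n == 0:
        return k_str == "1"  # assumption: 0^0 is taken as 1
    for _ in range(n):
        k_str, rem = divmod_decimal(k_str, n)
        if rem != 0:
            return False     # K is not divisible by N often enough
    return k_str == "1"
```

For example, `is_n_to_n(3, "27")` divides 27 down to 9, 3 and 1 with no remainder, while `is_n_to_n(3, "26")` stops at the first division.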

edited Jun 13 '12 at 18:30

answered Jun 13 '12 at 18:11


score 0 · Answer 4 · answered Jun 14 '12 at 08:55

Given two numbers N and K such that 0<=N<=1000 and 0<=K<=1000^1000. I need to check that whether N^N = K or not.

A lot depends on how those numbers are stored when provided to your code. Assuming they're in text format, I'd just keep them that way, and create an array of 1001 strings storing the N^N values. You could use an arbitrary precision arithmetic command line program like bc for a one-off creation of those strings, calling it in a loop.
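As a rough illustration of the lookup-table idea, here Python's built-in integers stand in for the one-off `bc` run. In C the same strings would be generated once and pasted in as a constant array. `TABLE` and `check` are illustrative names, and the input is assumed to be a decimal string without leading zeros:

```python
# One entry per N in 0..1000; Python evaluates 0**0 as 1.
TABLE = [str(n ** n) for n in range(1001)]


def check(n, k_str):
    """True if the decimal string k_str equals the stored N^N string."""
    return TABLE[n] == k_str
```

The comparison itself is then a plain string equality test, with no arithmetic at run time.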

score 0 · Answer 5 · answered Jun 14 '12 at 10:39

Since there will be only 1000 "good" values out of 1000^1000 (and they will be very far from one another), you could use some logarithm approximation first to get a guess for N. After that, you'll need at most one exact check (with some bignum library, for example).

This logarithm does not have to be exact; even strlen() can be a close enough approximation.
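A sketch of that idea, assuming K is a decimal string without leading zeros, so its length plays the role of `strlen()`. Python's built-in integers stand in for the bignum library, the function names are hypothetical, and one digit of slack is allowed so that floating-point rounding cannot cause a wrong rejection:

```python
import math


def digits_estimate(n):
    """Approximate number of decimal digits of n**n, for n >= 1."""
    return math.floor(n * math.log10(n)) + 1


def check(n, k_str):
    """True if the decimal string k_str represents n**n."""
    if n == 0:
        return k_str == "1"  # assumption: 0^0 is taken as 1
    # Cheap filter: the length of K must be within one digit of the estimate.
    if abs(digits_estimate(n) - len(k_str)) > 1:
        return False
    # At most one exact comparison is needed once the filter passes.
    return str(n ** n) == k_str
```

Most mismatched inputs are rejected by the length comparison alone, before any big-number arithmetic happens.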
