7

I have the length of a representation of an integer in an arbitrary base. Say the length is 15 and the base is 36. I'd then like to work out how long a representation of that integer would be in another arbitrary base. For example, converting to base 2 might result in a length of 68.

I know it's along the lines of the below, but I can't quite get my head around what I need to floor and ceil, and I'm getting some results that are way off:

length * log(fromBase) / log(toBase)
Max
  • Basically, the length in digits is log_N(number) + 1, truncated to an integer, where N is the base. – Hot Licks Jan 23 '13 at 03:16
  • @HotLicks Any way to do it without calculating the log of the actual number? The number is generally too large to do that trivially (think hundreds of digits in base 10). What I'm trying to do is calculate the size of the buffer into which to put the number when I later convert it. – Max Jan 23 '13 at 05:22
  • @Alec: You can easily ballpark the log (base 10) of the number by adding the log of the first few digits to the number of digits, minus one. For instance, log(42598) is ~4.629. Or, you could calculate log(4.26) = 0.629, add 5 for the number of digits, and subtract one, gives you 4.629. – Bob Murphy Jan 23 '13 at 06:11
  • You can make a table, of course. Or you can divide out the number. – Hot Licks Jan 23 '13 at 12:18
  • @BobMurphy - Note that computing the log from some subset of the digits is, in general, inadequate for getting the exact value. For example, 2^100-1 requires 100 base 2 digits, but 2^100 requires 101 base 2 digits. In base 10, 2^100 is 1267650600228229401496703205376, a number large enough that a double precision log will fail to see the difference. The difference between the two log10 values is on the order of 3.4e-31, so a standard log function cannot resolve it. –  Jan 23 '13 at 15:32
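
A rough Python sketch of the leading-digits estimate described in the comments. The function name `approx_log10` and the choice of 15 leading digits are assumptions made for illustration, not something from the thread. As the last comment warns, the estimate cannot separate numbers that share their leading digits, so a digit count derived from it may be off by one.

```python
import math

def approx_log10(decimal_digits, leading=15):
    """Estimate log10 of a positive integer given as a decimal digit string."""
    head = decimal_digits[:leading]
    # Treat the leading digits as a mantissa d.ddd..., then add the exponent.
    mantissa = int(head) / 10 ** (len(head) - 1)
    return math.log10(mantissa) + len(decimal_digits) - 1

print(approx_log10("42598"))  # about 4.629

# 2**100 and 2**100 - 1 share their first 15 digits, so the estimates coincide.
print(approx_log10(str(2 ** 100)) == approx_log10(str(2 ** 100 - 1)))  # True
```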

3 Answers

10

Following a Mathematica-like syntax, let

Log[b,n]

represent the logarithm to base b of n. Let Log[n] represent the natural logarithm of n.

Then the ratio

Log[b1,n]/Log[b2,n]

is constant, and equal to

Log[b2]/Log[b1]

This ratio is a multiplier for calculating the number of digits in base b1 from the number of digits in base b2 (or vice-versa if you see things that way). For the example in the question, a 15-digit base-36 number will need

15*Log[36]/Log[2] == 77.5489

base-2 digits. This is, of course, precisely the formula in your question. You only need to round the final answer up to the next integer, here 78, which is the most base-2 digits that any 15-digit base-36 number can need.

I'm not sure, of course, why you seem to be getting some results that are way off.
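
As a hedged illustration of the multiplier, here is a small Python sketch of the buffer-size bound. The function names `max_digits_float` and `max_digits_exact` are hypothetical. The first applies the log ratio in floating point; the second is an assumed cross-check that counts the digits of the largest `length`-digit number using integer arithmetic only.

```python
import math

def max_digits_float(length, from_base, to_base):
    """Upper bound on the digit count via the log ratio (floating point)."""
    return math.ceil(length * math.log(from_base) / math.log(to_base))

def max_digits_exact(length, from_base, to_base):
    """Same bound with integers only: digits of from_base**length - 1."""
    largest = from_base ** length - 1
    digits = 1
    power = to_base  # always equals to_base ** digits
    while largest >= power:
        power *= to_base
        digits += 1
    return digits

print(max_digits_float(15, 36, 2))  # 78
print(max_digits_exact(15, 36, 2))  # 78
```

Both give an upper bound for sizing a buffer. A particular 15-digit base-36 number may need fewer digits than this.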

High Performance Mark
  • This is a close, but not perfect, solution. For the purpose of calculating a sufficient size, it may be OK (but not if you need an exact size). – Rick Love Nov 16 '18 at 23:29
4

Sadly, there is no exact solution without computing in high precision. For example, (I'll use MATLAB for my work, including tools for high precision work I've written myself) what is 2^200? In base 10, we get:

vpij(2)^200
ans =
    1606938044258990275541962092341162602522202993782792835301376

That number is represented in binary using 201 base 2 digits. However, 2^200-1 only needs 200 base 2 digits to represent.

vpij(2)^200 - 1
ans =
    1606938044258990275541962092341162602522202993782792835301375

Now, we could compute the log of these numbers, as a double, by taking only the highest order digits. To find how many base 2 digits are needed to represent a number, we take its base 2 log, round down, and add 1.

format long g
1 + log2(vpij(2)^200)
ans =
   201

1 + log2(vpij(2)^200 - 1)
ans =
   201

Here log2 did exactly that, taking the top decimal digits to compute that log. See that it cannot tell that the second number really requires one less digit to store in binary form.

vpij2bin(vpij(2)^200)
ans =
100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

vpij2bin(vpij(2)^200 - 1)
ans =
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

We can see what happens by taking a high precision log of those numbers. Thus, accurate to 100 decimal places,

log2(hpf(2,100)^200)
ans =
200

log2(hpf(2,100)^200 - 1)
ans =
199.9999999999999999999999999999999999999999999999999999999999991022086719253476184905817230522465495

The difference between those two numbers is very small.

log10(hpf(2,100)^200) - log10(hpf(2,100)^200 - 1)
ans =
2.702621195974725251000559400026211938865e-61

So any computation using logs must fail here, unless a high precision log is itself taken. At best, you can come within a digit of being correct, but no closer than that. So if your goal is merely to allocate sufficient space for the number, always allocate one more digit than apparently needed. This should be sufficient until you start working with REALLY huge powers.
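
The same effect can be reproduced in Python, whose built-in integers are arbitrary precision. This is only an illustration, not part of the MATLAB session above: `int.bit_length()` is the exact integer route, and `math.log2` is the double precision one.

```python
import math

big = 2 ** 200

# Exact digit counts from integer arithmetic.
print(big.bit_length())        # 201
print((big - 1).bit_length())  # 200

# Both values round to the same double, so the logs are identical.
print(math.log2(big) == math.log2(big - 1))  # True
```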

(VPIJ is a new variable precision integer form in MATLAB, that will directly replace my older VPI tool. HPF is available already on the file exchange.)

4

You can get an exact answer without using logarithms. Walk up the powers of the target base until you reach one that exceeds the number.

Python example:

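def count_digits(number, base):
    """Return how many digits a non-negative integer needs in the given base."""
    digits = 1
    power = base  # always equals base ** digits
    while number >= power:
        power *= base
        digits += 1
    return digits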
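
If the number is only available as a digit string in the source base, a possible sketch is to parse it first and then count by repeated division. The function name `digits_needed` is hypothetical, and relying on Python's built-in `int(text, base)` limits the source base to 2 through 36.

```python
def digits_needed(text, from_base, to_base):
    """Exact digit count in to_base for a number written as `text` in from_base."""
    number = int(text, from_base)  # int() accepts bases 2 through 36
    digits = 1
    while number >= to_base:
        number //= to_base
        digits += 1
    return digits

# Largest and smallest 15-digit base-36 numbers, counted in base 2.
print(digits_needed("z" * 15, 36, 2))        # 78
print(digits_needed("1" + "0" * 14, 36, 2))  # 73
```

The two results show why the length alone does not fix the answer: 15-digit base-36 numbers need anywhere from 73 to 78 base-2 digits.
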
Robert K. Bell