11

Is there any fast way to find the largest power of 10 smaller than a given number?

I'm using this algorithm, at the moment, but something inside myself dies anytime I see it:

10**( int( math.log10(x) ) ) # python
pow( 10, (int) log10(x) )   // C

I could implement simple log10 and pow functions for my problems with one loop each, but still I'm wondering if there is some bit magic for decimal numbers.

peoro
  • 25,562
  • 20
  • 98
  • 150
  • What's wrong with this? It takes .0161 microseconds on average to perform `10**(int(math.log10(987654321987654321)))`, which is actually rather impressive. – Rafe Kettler Dec 22 '10 at 21:07
  • 4
    are we talking about integers or floats? – aronp Dec 22 '10 at 21:24
  • The fastest algorithm is probably going to be dependent on the size of `x`. You'll probably want to profile a hybrid approach. – Nick Larsen Dec 22 '10 at 22:01
  • In my case `x` is an integer, but I guess it shouldn't change that much (could cast a float `x` in an integral value and vice versa). – peoro Dec 23 '10 at 11:32
  • my answer was faster than others theoretically, It's useful for big numbers not 10 digits. – Saeed Amiri Dec 25 '10 at 05:36

8 Answers8

7

An alternative algorithm is:

i = 1;
while((i * 10) < x)
    i *= 10;
Anon.
  • 58,739
  • 8
  • 81
  • 86
  • 1
    That's a terrible approach in terms of computational complexity. Your method is O(n) in terms of the number of bits, whereas O(log(n)) is sufficient. – Yonatan N Dec 22 '10 at 22:14
  • 8
    Everything is a tradeoff. For numbers with a finite range, O(n) could very well be just fine, and this option uses no additional space (compared with having a large lookup table, precomputing a sequence of squares, etc.). Computational complexity is not the be-all and end-all of whether an algorithm is "good" or not. It would be relatively straightforward to modify this to square instead of naive-multiply until it gets close, but that would involve more code and might, in fact, slow it down on smaller inputs. The right choice depends on the expected input data, not solely on big-O performance. – Anon. Dec 22 '10 at 22:23
  • 1
    it's better do it while((i = i * 10) < x); or while (i – Saeed Amiri Dec 23 '10 at 09:59
  • Computational complexity is not the be-all and end-all - sure, but in this case he's specifically asked for the fastest way... – Mark Mayo Dec 23 '10 at 13:48
  • Most (all?) of the answers I got are a lot cheaper than my old code. This solution is the simplest one to be implemented, clean and really easy to be understood. I'm implementing this one since I think that for my average case (at the moment) it won't need to iterate more than 4 or 5 times, thus a fast O(n) solution should be ok; if I see it's not enough yet, I'll try and implement other algorithms as well. – peoro Dec 25 '10 at 01:50
6

Log and power are expensive operations. If you want fast, you probably want to look up the IEEE binary exponent in table to get the approximate power of ten, and then check if the mantissa forces a change by +1 or not. This should be 3 or 4 integer machine instructions (alternatively O(1) with a pretty small constant).

Given tables:

  int IEEE_exponent_to_power_of_ten[2048]; // needs to be 2*max(IEEE_exponent)
  double next_power_of_ten[600]; // needs to be 2*log10(pow(2,1024)]
  // you can compute these tables offline if needed
  for (p=-1023;p>1023;p++) // bounds are rough, see actual IEEE exponent ranges
  {  IEEE_exponent_to_power_of_ten[p+1024]=log10(pow(2,p)); // you might have to worry about roundoff errors here
     next_power_of_ten[log10(pow(2,p))+1024]=pow(10,IEEE_exponent_to_power_of_ten[p+1024]);
  }

then your computation should be:

  power_of_ten=IEEE_exponent_to_power_of_10[IEEE_Exponent(x)+1023];
  if (x>=next_power_of_ten[power_of_ten]) power_of_ten++;
  answer=next_power_of_ten[power_of_ten];

[You might really need to write this as assembler to squeeze out every last clock.] [This code not tested.]

However, if you insist on doing this in python, the interpreter overhead may swamp the log/exp time and it might not matter.

So, do you want fast, or do you want short-to-write?

EDIT 12/23: OP now tells us that his "x" is integral. Under the assumption that it is a 64 (or 32) bit integer, my proposal still works but obviously there isn't an "IEEE_Exponent". Most processors have a "find first one" instruction that will tell you the number of 0 bits on the left hand (most significant) part of the value, e.g., leading zeros; you likely This is in essence 64 (or 32) minus the power of two for the value. Given exponent = 64 - leadingzeros, you have the power of two exponent and most of the rest of the algorithm is essentially unchanged (Modifications left for the reader).

If the processor doesn't have a find-first-one instruction, then probably the best bet is a balanced discrimination tree to determine the power of ten. For 64 bits, such a tree would take at most 18 compares to determine the exponent (10^18 ~~ 2^64).

Ira Baxter
  • 93,541
  • 22
  • 172
  • 341
  • No, I'm not doing that in python -this was supposed a language independent algorithm- just wanted to show my current algorithm in a couple of languages to make it clearer, bad idea, probably... – peoro Dec 22 '10 at 21:11
  • 2
    *Log and power are expensive operations.* This is obviously the case in the absence of hardware support for transcendental operations. But, how slow are we talking on consumer grade general purpose CPU (modern x86, Power, x86_64...)? – dmckee --- ex-moderator kitten Dec 22 '10 at 21:42
  • If you don't want to use floating point, a similar integer scheme works by using a count-leading-zeros assembly instruction. – Keith Randall Dec 23 '10 at 00:04
  • @dmckee: Even with built in floating point, the transcendentals take some time execute. The chip vendors don't want to tell you precise clock counts on anything anymore but it is hard to believe you implement a high-precision floating point long in a few clocks. My scheme, being only few integer instructions (even the apparant floating compare can actually done with integer instructions because of how IEEE is defined!) seems likely hard to beat. Is this worth it? Probably not, unless the OP is doing just this repeatedly in an inner loop. But he didn't ask if it was worth the trouble :-} – Ira Baxter Dec 23 '10 at 06:26
  • Nice solution! I'm not going to implement this now: it seems quite complicated, but I will, in case will need to optimize even more the solution I chose. – peoro Dec 25 '10 at 01:46
4

Create an array of powers of 10. Search through it for the largest value smaller than x.

If x is fairly small, you may find that a linear search provides better performance than a binary search, due in part to fewer branch mis-predictions.

Dijkstra
  • 2,490
  • 3
  • 21
  • 35
3

The asymptotically fastest way, as far as I know, involves repeated squaring.

func LogFloor(int value, int base) as int
    //iterates values of the form (value: base^(2^i), power: 2^i)
    val superPowers = iterator
                          var p = 1
                          var c = base
                          while c <= value
                              yield (c, p)
                              c *= c
                              p += p
                          endwhile
                      enditerator

    //binary search for the correct power
    var p = 0
    var c = 1
    for val ci, pi in superPowers.Reverse()
        if c*ci <= value
            c *= ci
            p += pi
        endif
    endfor

    return p

The algorithm takes logarithmic time and space in N, which is linear to N's representation size. [The time bound is probably a bit worse because I simplified optimistically]

Note that I assumed arbitrarily large integers (watch out for overflow!), since the naive times-10-until-over algorithm is probably fast enough when dealing with just 32-bit integers.

Craig Gidney
  • 17,763
  • 5
  • 68
  • 136
  • 1
    @Yonatan The indentation is intended. It keeps the contents of the iterator block to the right of the block delimiters. – Craig Gidney Dec 22 '10 at 22:20
  • As for the previous solution, it's a nice one! I'm not going to implement this now: it seems quite complicated, but I will, in case will need to optimize even more the solution I chose. – peoro Dec 25 '10 at 01:47
1

In Python:

10**(len(str(int(x)))-1)

HaPsantran
  • 5,581
  • 6
  • 24
  • 39
1

I think the fastest way is O(log(log(n))^2), the while loop takes O(log(log(n)) and it can be recursive call finite time (we can say O(c) where see is constant), first recursive call is takes log(log(sqrt(n))) time second takes .. and the number of sqrt in sqrt(sqrt(sqrt....(n)) < 10 is log(log(n)) and constant, because of machine limitations.

    static long findPow10(long n)
    {
        if (n == 0)
            return 0;

        long i = 10;
        long prevI = 10;
        int count = 1;

        while (i < n)
        {
            prevI = i;
            i *= i;
            count*=2;
        }

        if (i == n)
            return count;

        return count / 2 + findPow10(n / prevI);
    }
Saeed Amiri
  • 22,252
  • 5
  • 45
  • 83
0

Given that this is language independent, if you can get the power of two that this number is significant to, eg y in x*2^y (which is the way the number is stored, though I'm not sure I have seen an easy way to access y in any language I have used) then if

z = int(y/(ln(10)/ln(2))) 

(one floating point division)

10^z or 10^(z+1) will be your answer, though 10^z is still is not so simple (beg to be corrected).

aronp
  • 799
  • 2
  • 6
  • 14
  • If you can get the power of two, you can simply look up z in table (my answer) at the cost of one memory access which is much cheaper than a float divide. – Ira Baxter Dec 31 '10 at 00:13
0

I timed the methods with the following variations in C++ for the value a being a size_t type (inlining improves performance but does not change relative ordering).

Try 1: Multiply until find number.

size_t try1( size_t a )
{
  size_t scalar = 1ul;
  while( scalar * 10 < a ) scalar *= 10;
  return scalar;
}

Try 2: Multiway if (could also be programmed using a lookup table).

size_t try2( size_t a )
{
  return ( a < 10ul ? 1ul :
   ( a < 100ul ? 10ul :
   ( a < 1000ul ? 100ul :
   ( a < 10000ul ? 1000ul :
   ( a < 100000ul ? 10000ul :
   ( a < 1000000ul ? 100000ul :
   ( a < 10000000ul ? 1000000ul :
   ( a < 100000000ul ? 10000000ul :
   ( a < 1000000000ul ? 100000000ul :
   ( a < 10000000000ul ? 1000000000ul :
   ( a < 100000000000ul ? 10000000000ul :
   ( a < 1000000000000ul ? 100000000000ul :
   ( a < 10000000000000ul ? 1000000000000ul :
   ( a < 100000000000000ul ? 10000000000000ul :
   ( a < 1000000000000000ul ? 100000000000000ul :
   ( a < 10000000000000000ul ? 1000000000000000ul :
   ( a < 100000000000000000ul ? 10000000000000000ul :
   ( a < 1000000000000000000ul ? 100000000000000000ul :
   ( a < 10000000000000000000ul ? 1000000000000000000ul :
         10000000000000000000ul )))))))))))))))))));
 }

Try 3: Modified from findPow10 of @Saaed Amiri, which uses squaring to more rapidly find very large powers than Try 1.

size_t try3( size_t a )
{
  if (a == 0)
    return 0;
  size_t i, j = 1;
  size_t prev = 1;
  while( j != 100 )
  {
    i = prev;
    j = 10;
    while (i <= a)
    {
      prev = i;
      i *= j;
      j *= j;
    }
  }
  return prev;
}

Try 4: Lookup table indexed using count leading zeros instruction as per @Ira Baxter.

static const std::array<size_t,64> ltable2{
1ul, 1ul, 1ul, 1ul, 1ul, 10ul, 10ul, 10ul,
100ul, 100ul, 100ul, 1000ul, 1000ul, 1000ul,
1000ul, 10000ul, 10000ul, 10000ul, 100000ul,
100000ul, 100000ul, 1000000ul, 1000000ul,
1000000ul, 1000000ul, 10000000ul, 10000000ul,
10000000ul, 100000000ul, 100000000ul,
100000000ul, 1000000000ul, 1000000000ul,
1000000000ul, 1000000000ul, 10000000000ul,
10000000000ul, 10000000000ul, 100000000000ul,
100000000000ul, 100000000000ul, 1000000000000ul,
1000000000000ul, 1000000000000ul, 1000000000000ul,
10000000000000ul, 10000000000000ul, 10000000000000ul,
100000000000000ul, 100000000000000ul, 100000000000000ul,
1000000000000000ul, 1000000000000000ul, 1000000000000000ul,
1000000000000000ul, 10000000000000000ul, 10000000000000000ul,
10000000000000000ul, 100000000000000000ul, 100000000000000000ul,
100000000000000000ul, 100000000000000000ul, 1000000000000000000ul,
1000000000000000000ul };
size_t try4( size_t a )
{
  if( a == 0 ) return 0;
  size_t scalar = ltable2[ 64 - __builtin_clzl(a) ];
  return (scalar * 10 > a ? scalar : scalar * 10 );
}

Timing is as follows (gcc 4.8)

for( size_t i = 0; i != 1000000000; ++i) try1(i)    6.6
for( size_t i = 0; i != 1000000000; ++i) try2(i)    0.3
for( size_t i = 0; i != 1000000000; ++i) try3(i)    6.5
for( size_t i = 0; i != 1000000000; ++i) try4(i)    0.3
for( size_t i = 0; i != 1000000000; ++i) pow(10,size_t(log10((double)i)))
                                                   98.1

The lookup/multiway-if beats everything in C++, but requires we know integers are a finite size. try3 is slower than try1 in this test for smaller values of the loop end value, for large numbers try3 beats try1. In python things are made difficult because integers are not limited so I would combine try2 with try3 to quickly process numbers up to a fixed limit then handle the possibly very large numbers.

In python I think lookup using a list comprehension is probably faster than a multiway-if.

# where we previously define lookuptable = ( 1, 10, 100, ..... )
scalar = [i for i in lookuptable if i < a][-1]
Justin Finnerty
  • 329
  • 1
  • 7
  • 1
    When you need to write code that starts to look like ASCII art of a sailboat, you should be asking yourself if you *really* need this kind of optimization. – Victor Zamanian Apr 15 '20 at 11:00