Most efficient way to get digit count of arbitrarily big number

Question

What is the most efficient way to get the number of digits of a number?

Let's begin with an example:

Imagine the Fibonacci sequence. Now let's say we want to know which Fibonacci number is the first to have 1000 digits (in base 10 representation). Up to 308 digits (1476th Fibonacci number) we can easily do this by using logBase 10 <number>. If the number is greater than the 1476th Fibonacci number, logBase will return Infinity and the calculation will fail. The problem is that 308 is somewhat far away from 1000, which was our initial goal.
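To make the failure mode concrete, here is a minimal sketch of the logBase approach. The name `logViaDouble` is made up for illustration; the point is only that converting a large `Integer` to `Double` overflows to Infinity before the logarithm is taken.

```haskell
-- Hypothetical helper: base-10 logarithm of an Integer, going through Double.
-- Once n exceeds the largest finite Double (roughly 1.8e308), fromInteger
-- yields Infinity, and so does the logarithm.
logViaDouble :: Integer -> Double
logViaDouble n = logBase 10 (fromInteger n)
```

With this definition, `isInfinite (logViaDouble (10 ^ 400))` is `True`, while `logViaDouble (10 ^ 300)` is still finite.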

A possible solution is to convert the number to a string and use its length as the digit count. This is a little inefficient for my purposes because trying this with 10000 takes its sweet time.

The most efficient method shown in other questions is hardcoding all possible cases, which I really do not want to do, especially because the number of digits here far exceeds the 10 or so that the proposed solutions handle.

So to come back to my question: What is the best (most efficient) way to determine a base 10 number's digit count? Is it really converting it to a string and using its length, or are there any "hacker" tricks like 0x5f3759df?

Note: I appreciate solutions in any language, even if this is tagged "haskell".

@user2864740 Yes, indeed it is the maximum possible value of a 64-bit double. Haskell however supports arbitrarily long numbers (using ``Integer``) which is why my question includes even greater numbers. — ThreeFx, Jul 28 '14 at 23:16

bheklilr · Accepted Answer · 2014-07-28T23:57:26.253

8

Why not use div until the number is less than 10?

digitCount :: Integer -> Int
digitCount = go 1 . abs
    where
        go ds n = if n >= 10 then go (ds + 1) (n `div` 10) else ds

This takes O(n) divisions, where n is the number of digits, and you could speed it up easily by checking against 1000, then 100, then 10, but this will probably be sufficient for most uses.
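The speed-up mentioned here could look roughly like the sketch below. It is an illustration only, not code from this answer: `digitCountChunked` is a made-up name, and stripping three digits per division is an arbitrary choice.

```haskell
-- Chunked variant of digitCount: while at least four digits remain,
-- drop three of them with a single division by 1000.
digitCountChunked :: Integer -> Int
digitCountChunked = go 1 . abs
  where
    go ds n
      | n >= 1000 = go (ds + 3) (n `div` 1000)  -- strip three digits at once
      | n >= 100  = ds + 2                      -- exactly three digits left
      | n >= 10   = ds + 1                      -- exactly two digits left
      | otherwise = ds                          -- a single digit left
```

This performs about a third as many divisions as the one-digit-at-a-time version; a larger divisor would reduce the count further at the cost of more comparisons in the final step.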

For reference, on my not-so-great laptop running it only in GHCi and using the horribly inaccurate :set +s statistics flag:

> let x = 10 ^ 10000 :: Integer
> :force x
<prints out 10 ^ 10000>
> digitCount x
10001
it :: Int
(0.06 secs, 23759220 bytes)

So it seems pretty quick: it can churn through a 10001-digit number in less than a tenth of a second without optimizations.

If you really wanted the O(log(n)) complexity, I would recommend writing your own version where you divide by 2 each time, but that one is a little more involved and trickier than dividing by 10. For your purposes this version will easily compute the number of digits up to about 20000 digits without problems.

edited Jul 28 '14 at 23:57

answered Jul 28 '14 at 23:34

bheklilr


Forgot about that method, it was probably too late yesterday :D Anyhow, how would you do this for numbers **even greater** than ``10^10000`` in the Fibonacci sequence? Isn't there a way to easily determine the log10 of a function of the form ``a^x``, where for the Fibonacci numbers ``a`` equals the golden ratio? – ThreeFx Jul 29 '14 at 11:56
@ThreeFx I think you're confused a bit, because `phi ^ x` is not going to be a fibonacci number. The relationship of the golden ratio to the fibonacci numbers is that the ratio between adjacent elements of the fibonacci sequence approaches `phi` as you approach infinity. If you want to speed it up, probably the best you'll get is figuring out the precise details to know that `digitCount n` is approximately equal to `2 * digitCount (2 * sqrt n)`, and instead of dividing a few million times you'd just end up taking the square root a few hundred instead. – bheklilr Jul 29 '14 at 12:26

As an "optimization", instead of dividing by 10 each time, you could divide by 10^300, and once you get something smaller than 10^300 use `logBase 10`. I don't know if that would be faster, I guess it depends on how quick the `logBase` function is. – Omar Antolín-Camarena Jul 29 '14 at 15:40

David Young · Answer 2 · 2014-07-29T17:31:55.913

If you just want to find the first number with at least digitCount digits in a list, you could test each number in O(1) by checking if fibBeingTested >= 10^{digitCount - 1}. This works since 10^{digitCount - 1} is the lowest number with at least digitCount digits:

import Data.List (find)

fibs :: [Integer]
-- ...

findFib :: Int -> Integer
findFib digitCount =
  let Just solution = find (>= tenPower) fibs
  in
  solution
  where
    tenPower = 10 ^ (digitCount - 1)

We use digitCount - 1 because 10^1, for instance, is 10 which has two digits.
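For a version that can be loaded as-is, here is a minimal self-contained sketch of the same comparison. The `fibs` definition is an assumption (the answer leaves its own definition out), and `firstWithDigits` is a hypothetical name; it returns a `Maybe` rather than pattern-matching on `Just`.

```haskell
import Data.List (find)

-- Assumed definition of the Fibonacci list; the answer does not show its own.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- First Fibonacci number with at least d decimal digits.
-- Assumes d >= 1: for d = 0 the exponent is negative and (^) raises an error.
firstWithDigits :: Int -> Maybe Integer
firstWithDigits d = find (>= threshold) fibs
  where
    threshold = 10 ^ (d - 1) :: Integer  -- smallest number with d digits
```

For example, `firstWithDigits 3` gives `Just 144`. Because `fibs` is infinite and unbounded, `find` always succeeds here, so the `Maybe` only reflects the type of `find`.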

As a result of the O(1) complexity that this comparison has, you can find Fibonacci numbers very quickly. On my machine:

λ> :set +s
λ> findFib 10000
[... the first Fibonacci number with at least 10,000 digits ...]
(0.23 secs, 121255512 bytes)

If the list of fibs has already been computed up to the first 10,000-digit Fibonacci number (for example, if you run findFib 10000 twice), it's even faster, which shows that more computation is taking place in calculating each Fibonacci number than in finding the one you're looking for:

λ> findFib 10000   -- Second run of findFib 10000
[... the first Fibonacci number with at least 10,000 digits ...]
(0.04 secs, 9922000 bytes)

score 1 · Answer 3 · answered Jul 29 '14 at 01:55

For just getting up to a Fibonacci number that has more than 1000 digits, length . show (on Integer) suffices.

GHCi> let fibs = Data.Function.fix $ (0:) . scanl (+) 1
GHCi> let digits = length . (show :: Integer -> String)
GHCi> :set +t +s
GHCi> fst . head . dropWhile ((1000>) . digits . snd) $ zip [0..] fibs
4782
it :: Integer
(0.10 secs, 149103264 bytes)

For floating-point numbers outside the range of Double (so you can still use logBase), look to the numbers package. They are downright slow, but you do have to pay something for that type of accuracy.

score 0 · Answer 4 · answered Jul 29 '14 at 16:28

You could always try binary search to find the number of digits of n: first find a k such that 10^2^k ≥ n, and then divide n successively by 10^2^(k-1), 10^2^(k-2), ..., 10^2^0:

numDigits n = fst $ foldr step (1,n) tenToPow2s
  where
    pow2s = iterate (*2) 1
    tenToPow2s = zip pow2s . takeWhile (<=n) . iterate (^2) $ 10
    step (k,t) (d,n) = if n>=t then (d+k, n `div` t) else (d,n)

For the specific case of Fibonacci numbers you could also just try math: the n-th Fibonacci number F(n) is between (φ^n - 1)/√5 and (φ^n + 1)/√5, so for the base 10 logarithm we have:

log(F(n)) - n log(φ) + log(√5) ∈ [log(1 - 1/φ^n), log(1 + 1/φ^n)]

That interval gets tiny right away.
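As a rough sketch of how that bound could be used, the digit count of F(n) can be estimated from n log(φ) - log(√5) in ordinary Double arithmetic. The names `fibDigitsApprox` and `firstFibIndexWithDigits` are made up for illustration, and the indexing F(1) = F(2) = 1 is an assumption. Because the result relies on floating point, it could be off by one if the logarithm lands extremely close to an integer, so treat it as an estimate to verify rather than a proof.

```haskell
-- Golden ratio as a Double.
phi :: Double
phi = (1 + sqrt 5) / 2

-- Estimated number of decimal digits of F(n), from F(n) ≈ phi^n / sqrt 5.
-- Assumes n >= 2; for n = 1 the approximation is still too coarse.
fibDigitsApprox :: Int -> Int
fibDigitsApprox n =
  floor (fromIntegral n * logBase 10 phi - logBase 10 (sqrt 5)) + 1

-- Estimated index of the first Fibonacci number with at least d digits,
-- obtained by solving n * log10 phi - log10 (sqrt 5) >= d - 1 for n.
-- Assumes d >= 2.
firstFibIndexWithDigits :: Int -> Int
firstFibIndexWithDigits d =
  ceiling ((fromIntegral (d - 1) + logBase 10 (sqrt 5)) / logBase 10 phi)
```

`firstFibIndexWithDigits 1000` evaluates to 4782, which agrees with the index found by the `length . show` session in the earlier answer, and no big-integer arithmetic is needed at all.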
