How can I find out how much storage it would take to store every number between 1 and 2^100?

Question

I am a complete beginner when it comes to working with large numbers in code. I know this is definitely the wrong approach, but this is what I started with.

import tqdm

try:
    total = 0
    for num in tqdm.tqdm(range(2**100), total=2**100):
        total += len(bin(num)) - 2
finally:
    with open('results.txt', 'w') as file:
        file.write(f'{total=}')

The result I got was:

0%|                  | 87580807/1267650600228229401496703205376 [00:39<159887459362604133471:34:24, 2202331.37it/s]

Obviously this approach is going to take way too long. I know I could try making this multi-core but I don't think this is going to make much of a difference in the speed.

What are my options here?

Will using another language like C significantly increase the speed so that it will take days or hours instead of eons? Is there another approach/algorithm I can use?

Is this a programming quiz question? Usually, you aren't supposed to actually store it, merely calculating how much it would take. — Martheen, Sep 10 '21 at 04:44
It depends on what format you want to store it in. Suppose you want a fixed number of bits per value. You need something bigger than a 64 bit int. You could just choose a 128 bit int, which is 16 bytes and the math is easy. — tdelaney, Sep 10 '21 at 04:44
@rv.kvetch - [tqdm](https://github.com/tqdm/tqdm) is a progress bar for python. Instead of `for num in range(..)` you run it through tqdm and you get ascii art showing progress. — tdelaney, Sep 10 '21 at 04:47
The numbers from `2**99` (inclusive) to `2**100` (exclusive) all take exactly 100 bits. Explicitly looping over all those numbers to count their bits is simply absurd, just multiply 100 by the size of the range (and similarly for all the previous power-of-2 ranges). — jasonharper, Sep 10 '21 at 04:49
Suppose you could write 1 billion a second, which is about `2**30`. So, now you just need `2**70` seconds. With 31536000 seconds in a year, you could crank this out in 37,436,314,710,724 years. If you want to get it done in a hundred years you could farm it out to 374,363,147,107 computers. That's even bigger than facebook. — tdelaney, Sep 10 '21 at 04:58
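
The back-of-the-envelope estimate in the last comment can be reproduced with exact integer arithmetic. This is only a sketch: the write rate of `2**30` numbers per second and the 365-day year are the commenter's assumptions, and the variable names are mine.

```python
# Assumptions taken from the comment: about 2**30 numbers written per second,
# and a 365-day year (31,536,000 seconds).
NUMBERS = 2**100
RATE_PER_SECOND = 2**30
SECONDS_PER_YEAR = 31_536_000

seconds = NUMBERS // RATE_PER_SECOND        # 2**70 seconds in total
years = seconds // SECONDS_PER_YEAR         # whole years on a single machine
machines_for_100_years = years // 100       # machines needed to finish in a century

print(f"{years:,} years on one machine")
print(f"{machines_for_100_years:,} machines to finish in 100 years")
```

Python integers are arbitrary precision, so none of these values overflow or lose digits.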

Andres · Accepted Answer · 2021-09-10T06:18:06.047

Ok I figured it out. I used @jasonharper's approach.

The code is as follows:

total = 0
for power in range(1, 101):
    # Exactly 2**(power - 1) integers need `power` bits:
    # 2**(power - 1) through 2**power - 1.
    total += 2 ** (power - 1) * power

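total was equal to 125497409422594710748173617332225, which is the number of bits (not bytes) needed to store every number from 1 to 2^100 - 1, assuming each number takes only as many bits as its binary representation needs. That is roughly 1.57 × 10^31 bytes. (2^100 itself would add another 101 bits.)

For some context, it would take ≈53176868399.4 times the total storage capacity of the Earth to store all numbers between 1 and 2^100. (Misreading the bit count as bytes gives a figure eight times larger, ≈425414947195.2363.)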

Reference: https://www.zdnet.com/article/what-is-the-worlds-data-storage-capacity/
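
As a cross-check on the total above, the loop collapses to a closed form: the bit lengths of 1 through 2^n - 1 add up to (n - 1) * 2^n + 1. Here is a minimal sketch. The function names are mine, and it assumes each number is stored in exactly its bit length with no padding or separators. For comparison it also computes the fixed-width option suggested in the comments (16 bytes per value).

```python
def total_bits_loop(n: int) -> int:
    """Sum of the bit lengths of 1 .. 2**n - 1, one term per bit width."""
    return sum(width * 2 ** (width - 1) for width in range(1, n + 1))


def total_bits_closed_form(n: int) -> int:
    """Closed form of the same sum: (n - 1) * 2**n + 1."""
    return (n - 1) * 2**n + 1


bits = total_bits_closed_form(100)
assert bits == total_bits_loop(100)

packed_bytes = -(-bits // 8)              # ceiling division: bits -> bytes
fixed_width_bytes = 16 * (2**100 - 1)     # every value padded to 128 bits

print(f"{bits=}")
print(f"{packed_bytes=}")
print(f"{fixed_width_bytes=}")
```

The tightly packed figure is a lower bound rather than a usable format, since without delimiters you could not tell where one number ends and the next begins. The fixed-width layout is about 29% larger (128/99) but can actually be read back.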

Epsi95 · Answer 2 · 2021-09-10T06:34:47.037

Interesting problem, but not every problem should be solved by brute force; this is where an algorithm helps. Looking at your problem, it seems you want to count the total number of bits required for all the numbers up to some n. If we look closely:

number of bits    how many numbers that many bits can represent (0 included)
1                  2**1 = 2
2                  2**2 = 4
3                  2**3 = 8
4                  2**4 = 16
5                  2**5 = 32
...

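Each extra bit doubles the count, so going from k-1 bits to k bits adds 2^(k-1) new numbers (the 1-bit row covers both 0 and 1). The total number of bits for every number below 2^n is therefore

1*2 + 2*2 + 3*2^2 + 4*2^3 + ... + n*2^(n-1)
= 1 + 1*2^0 + 2*2^1 + 3*2^2 + 4*2^3 + ... + n*2^(n-1)
= 1 + sum(k*2^(k-1)) for k from 1 to n, where n is the number of bits
= 1 + (n*2^n - 2^n + 1)
= n*2^n - 2^n + 2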
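
For anyone wondering where the closed form for the inner sum comes from, here is one way to derive it. This is a sketch: S_n is my own name for the sum, with n the number of bits. Doubling the sum and subtracting the original leaves a plain geometric series:

```latex
\begin{aligned}
S_n &= \sum_{k=1}^{n} k\,2^{k-1} \\
2S_n &= \sum_{k=1}^{n} k\,2^{k} \\
S_n = 2S_n - S_n &= n\,2^{n} - \sum_{k=0}^{n-1} 2^{k} \\
&= n\,2^{n} - \left(2^{n} - 1\right) \\
&= (n-1)\,2^{n} + 1
\end{aligned}
```

Adding the extra 1 for the number 0 gives n*2^n - 2^n + 2.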

The terms form an arithmetico-geometric progression. Using the closed form above, you can write the calculation as a function:

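def count_the_total_number_of_bits_till(n):
    """Total bits needed to write every integer in range(n), i.e. 0 .. n - 1 (n >= 1)."""
    # Largest exponent p with 2**p <= n. int.bit_length() is exact for big
    # integers, whereas math.log rounds and can be off by one for large n.
    nearest_2_to_the_power = n.bit_length() - 1
    # Every number from 2**p up to n - 1 needs p + 1 bits.
    actual_number_of_bits_required = nearest_2_to_the_power + 1

    # Closed form for all numbers below 2**p (0 counts as one bit).
    sum_1 = (nearest_2_to_the_power * 2**nearest_2_to_the_power) - 2**nearest_2_to_the_power + 2
    # Remaining numbers from 2**p up to n - 1.
    extra_sum = (n - 2**nearest_2_to_the_power) * actual_number_of_bits_required

    return sum_1 + extra_sum

print(count_the_total_number_of_bits_till(2**10))  # 9218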

For comparison, this is the brute-force loop you were running:

sum_ = 0
# Equivalent to count_the_total_number_of_bits_till(2**10)
for i in range(2**10):
    sum_ += len(bin(i)[2:])

print(sum_)
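
To build some confidence that the formula and the brute-force loop agree, you can compare them for every small n before trusting the formula at 2^100. This is a self-contained sketch. The function names are mine, and it uses `int.bit_length()` as an exact stand-in for the logarithm.

```python
def bits_closed_form(n: int) -> int:
    """Total bits to write every integer in range(n); 0 counts as one bit."""
    p = n.bit_length() - 1                 # largest p with 2**p <= n
    below_power = p * 2**p - 2**p + 2      # all numbers below 2**p
    rest = (n - 2**p) * (p + 1)            # numbers 2**p .. n - 1 need p + 1 bits
    return below_power + rest


def bits_brute_force(n: int) -> int:
    """Same quantity, counted one number at a time."""
    return sum(len(bin(i)) - 2 for i in range(n))


# Compare both approaches for every n from 1 to 512.
for n in range(1, 513):
    assert bits_closed_form(n) == bits_brute_force(n), n

print(bits_closed_form(2**100))
```

The value for 2^100 comes out one higher than the accepted answer's total, because this version also counts the number 0 as one bit.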
