6

I am building an encryption program which produces a massive integer.It looks something like this:

a = plaintextOrd**bigNumber

when i do

a = str(a)

it takes over 28 minutes.

Is there any possible way to convert an integer like this quicker that using the built in str() function?

the reason i need it to be a string is because of this function here:

def divideStringIntoParts(parts,string):
    parts = int(parts)
    a = len(string)//parts

    new = []
    firstTime = True
    secondTime = True
    for i in range(parts):
        if firstTime:
            new.append(string[:a])
            firstTime = False
        elif secondTime:
            new.append(string[a:a+a])
            secondTime = False
        else:
            new.append(string[a*i:a*(i+1)])

    string2 = ""
    for i in new:
        for i in i:
            string2 += i

    if len(string2) - len(string) != 0:
        lettersNeeded = len(string) - len(string2)
        for i in range(lettersNeeded):
            new[-1] += string[len(string2) + i] 

    return new

Matthew Tranmer
  • 414
  • 4
  • 12
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/202959/discussion-on-question-by-matthew-tranmer-is-it-possible-to-convert-a-really-lar). – Samuel Liew Nov 24 '19 at 02:30

3 Answers3

4

You wrote in the comments that you want to get the length of the integer in decimal format. You don't need to convert this integer to a string, you can use "common logarithm" instead:

import math
math.ceil(math.log(a, 10))

Moreover, if you know that:

a = plaintextOrd**bigNumber

then math.log(a, 10) is equal to math.log(plaintextOrd, 10) * bigNumber, which shouldn't take more than a few milliseconds to calculate:

>>> plaintextOrd = 12345
>>> bigNumber = 67890
>>> a = plaintextOrd**bigNumber
>>> len(str(a))
277772
>>> import math
>>> math.ceil(math.log(a, 10))
277772
>>> math.ceil(math.log(plaintextOrd, 10) * bigNumber)
277772

It should work even if a wouldn't fit on your hard drive:

>>> math.ceil(math.log(123456789, 10) * 123456789012345678901234567890)
998952457326621672529828249600

As mentioned by @kaya3, Python standard floats aren't precise enough to describe the exact length of such a large number.

You could use mpmath (arbitrary-precision floating-point arithmetic) to get results with the desired precision:

>>> from mpmath import mp
>>> mp.dps = 1000
>>> mp.ceil(mp.log(123456789, 10) * mp.mpf('123456789012345678901234567890'))
mpf('998952457326621684655868656199.0')
Eric Duminil
  • 52,989
  • 9
  • 71
  • 124
  • This is an excellent application of mathematics to make a seemingly big problem tiny; +1. However, for numbers of this size, `math.log` won't give exact enough answers because the actual number of digits is outside the range where floating point numbers are accurate to the nearest integer. If `x` is your output here, then `float(x) == float(x+1)`. So this code doesn't actually give the same answer as the naive `len(str(x))` would if it ever finished running. – kaya3 Nov 24 '19 at 05:47
  • Fantastic - I haven't seen `mpmath` before but it looks like an excellent library for these use-cases where a problem about very large integers has a neat solution via real numbers. – kaya3 Nov 24 '19 at 16:13
  • @kaya3: I think it's part of [sympy](https://docs.sympy.org/0.6.7/modules/mpmath/basics.html), which only has awesome modules. – Eric Duminil Nov 24 '19 at 16:14
3

Some quick notes on the "I need it for this function".

  • You don't need the first/second logic:
    • [:a] == [a*0:a*(0+1)]
    • [a:a+a] == [a*1:a*(1+1)]

So we have

    new = []
    for i in range(parts):
        new.append(string[a*i:a*(i+1)])

or just new = [string[a*i:a*(i+1)] for i in range(parts)].

Note that you have silently discarded the last len(string) % parts characters.

In your second loop, you shadow i with for i in i, which happens to work but is awkward and dangerous. It can also be replaced with string2 = ''.join(new), which means you can just do string2 = string[:-(len(string) % parts)].

You then see if the strings are the same length, and then add the extra letters to the end of the last list. This is a little surprising, e.g. you would have

>>> divideStringIntoParts(3, '0123456789a')
['012', '345', '6789a']

When most algorithms would produce something that favors even distributions, and earlier elements, e.g.:

>>> divideStringIntoParts(3, '0123456789a')
['0124', '4567', '89a']

Regardless of this, we see that you don't really care about the value of the string at all here, just how many digits it has. Thus you could rewrite your function as follows.

def divide_number_into_parts(number, parts):
    '''
    >>> divide_number_into_parts(12345678901, 3)
    [123, 456, 78901]
    '''
    total_digits = math.ceil(math.log(number + 1, 10))
    part_digits = total_digits // parts
    extra_digits = total_digits % parts

    remaining = number
    results = []
    for i in range(parts):
        to_take = part_digits
        if i == 0:
            to_take += extra_digits
        digits, remaining = take_digits(remaining, to_take)
        results.append(digits)
    # Reverse results, since we go from the end to the beginning
    return results[::-1]


def take_digits(number, digits):
    '''
    Removes the last <digits> digits from number.
    Returns those digits along with the remainder, e.g.:
    >>> take_digits(12345, 2)
    (45, 123)
    '''
    mod = 10 ** digits
    return number % mod, number // mod

This should be very fast, since it avoids strings altogether. You can change it to strings at the end if you'd like, which may or may not benefit from the other answers here, depending on your chunk sizes.

Cireo
  • 4,197
  • 1
  • 19
  • 24
2

Faster than function str conversion of int to str is provided by GMPY2

Source of Example Below

import time
from gmpy2 import mpz

# Test number (Large)
x = 123456789**12345

# int to str using Python str()
start = time.time()
python_str = str(x)
end = time.time()

print('str conversion time {0:.4f} seconds'.format(end - start))

# int to str using GMPY2 module
start = time.time()
r = mpz(x)
gmpy2_str = r.digits()
end = time.time()

print('GMPY2 conversion time {0:.4f} seconds'.format(end - start))
print('Length of 123456789**12345 is: {:,}'.format(len(python_str)))
print('str result == GMPY2 result {}'.format(python_str==gmpy2_str))

Results (GMPY2 was 12 times faster in test)

str conversion time 0.3820 seconds
GMPY2 conversion time 0.0310 seconds
Length of 123456789**12345 is: 99,890
str result == GMPY2 result True
DarrylG
  • 16,732
  • 2
  • 17
  • 23