1

Is it possible to efficiently store big ints in MySQL? Not big as in bigint, more like 1000th Catalan number or so, and there are at least hundreds of them. More likely, thousands. The speed of storing and loading values is a priority.

The numbers are created using gmpy2, Python's GMP wrapper. So it'd be nice if the stored numbers were accessible in the same format, but if not, I can always write a converter.

Synedraacus
  • 975
  • 1
  • 8
  • 21
  • 1
    There are various ways you can do it. One way is ,You can store number as a text (so 1000 places long number = 1000 char size word) and then convert back to int when needed. – Anil_M Mar 22 '18 at 04:31
  • if bigint is too big for you, use mediumint or int, never try to use strings for storing numbers, it will be _much_ slower. If you exceed bigint consider moving to two bigint columns and adding logic to split big number into 2 bigints – Alexey Mar 22 '18 at 06:52
  • @Alexey How *much* slower will it be? – Strawberry Mar 22 '18 at 07:29
  • @Strawberry int is 4 bytes, mediumint is 3 bytes, bigint is 8 bytes. Number in string consumes number of digits + 2 or 4 bytes for string size. So for 1000 digit number it's 1000 times slower ;) – Alexey Mar 22 '18 at 07:33
  • @Alexey and don't forget the added cost of operations on strings-as-ints or conversion to int. Those are about an order of magnitude slower than GMP arbitrary precision ints. Not precisely an SQL problem, but still a part of total runtime. – Synedraacus Mar 23 '18 at 08:12

1 Answers1

1

I would convert the mpz integer to a Python string (for v2) or bytes (for v3) and save it as a BLOB entity in mysql. Here are the relevant gmpy2 commands using Python 2.

>>> import gmpy2
>>> gmpy2.mpz(1234)
mpz(1234)
>>> gmpy2.to_binary(mpz(1234))
'\x01\x01\xd2\x04'
>>> gmpy2.from_binary('\x01\x01\xd2\x04')
mpz(1234)
>>> 

gmpy2.to_binary will convert an arbitrary gmpy2 object to a binary string. gmpy2.from_binary will convert the binary string back to a gmpy2 object. The object's type is encoded into the binary string so you don't need to track the object's type.

Disclaimer: I maintain gmpy2.

casevh
  • 11,093
  • 1
  • 24
  • 35