1

I'm working with integers greater up to 128 bits long. They're used to store a large set of flags and I don't need to do any weird math with them, so I can treat it like a string in php to get around the PHP_INT_MAX limit.

I think want to store these numbers in a BINARY column in mysql. The column will need between 4-16 bytes to hold the numbers.

My understanding is that the BINARY column in php is a binary string, does this mean I'll be wasting space by not using a-z as part of the character set? Should I be using a different type of column? Do I need to base_convert in php to get the full use of the character set?

How do I get my string representation of a 128 bit integer in php stored the most efficiently into a 128 bit column in php?

Also, if roughly half of the integers I'm storing with only need 4 bytes, would I be better off using a VARBINARY column?

Matthew
  • 6,351
  • 8
  • 40
  • 53
  • Will you be performing calculations/computations on this value with MySQL? If not, you could use varchar(128) – Mike B Apr 02 '13 at 23:05
  • 2
    I could use a varchar(128), but that would use much more space than I need because each decimal digit would take up 8 bits. I want to store it purely as a binary data to get the most out of the space. – Matthew Apr 03 '13 at 14:14
  • 1
    Out of curiosity, why the space saving requirement in the first place? If you manage to fill a terabyte-sized disk, you'll have other issues to worry about which won't be space utilisation. – N.B. Apr 03 '13 at 14:41
  • 1
    @N.B. Same reason you wouldn't use a text column to store a zip code. Why would you choose to use more space than is necessary? It's not too complicated to achieve, I knew it was possible, and it's a good learning experience to find out the best way to do something. It's not just about using more space than is necessary, it's about wasting space using a character set supporting characters I'll never use. – Matthew Apr 03 '13 at 15:24
  • @Matthew - storage engines don't work as efficiently as you might believe and they have internal reasons for that - bottom line, you *might* not get what you think you will. I wouldn't choose to *use* more space, I'd choose not to worry about space requirements and that's a different deal altogether. – N.B. Apr 03 '13 at 15:28
  • @N.B. - Are you suggesting I use a different data type to store these flags? If so, what do you think would be better and why? – Matthew Apr 03 '13 at 16:40
  • Re *"BINARY column in php is a binary string"*. No, BINARY is raw bytes. If you send it in as a hex code, each two characters in hex is stored in a single byte in BINARY. So Binary(16) will perfectly store your 8x16 = 128-bit integer. Its hard to say whether to use VarBinary - in many ways a DB is more efficient with fixed-length columns, but at 4 bytes vs 16 bytes, it might be. Don't even consider using CHAR or VARCHAR - more space, no benefit. – ToolmakerSteve Apr 07 '19 at 16:42

2 Answers2

2

Assuming you will be using the full range of 128 bits, each number equally likely, the most space-efficient you can be is storing 128/8 = 16 chars per number.

Conversion between this binary representation and a PHP string is a little problematic though. To decode a binary string into a base-16 number you can use unpack:

$numberInHex = unpack("H*", $binaryData);

If you must convert the output to or from decimal numbers you'll have to use gmp or bc.

Update: An example:

> create table binary_test ( int128 binary(16) );
> insert into binary_test set int128 = 0x11223344556677889900112233445566;
> select hex(int128) from binary_test;
+----------------------------------+
| hex(int128)                      |
+----------------------------------+
| 11223344556677889900112233445566 |
+----------------------------------+

Instead of 0x1122... you could also use unhex('1122..').

Joni
  • 108,737
  • 14
  • 143
  • 193
  • I don't ever need a decimal representation. When I check the flags I can take substrings and only need up to 4 bytes at a time, so using it in hex is fine. So if I insert into mysql from php, I need to send the data in hex format and that will store it correctly in mysql? – Matthew Apr 03 '13 at 14:20
  • 1
    Hexadecimal is probably easiest to work with, yes. Updating the answer with an example. – Joni Apr 03 '13 at 14:30
  • 2
    Perfect, just what I was looking for. Just a note for anyone else who finds this, another way to insert hex is preceding a quoted hex value with 'x': `insert into binary_test set int128 = x'123456789ABCDEF';`. – Matthew Apr 03 '13 at 14:43
0

If you save the integers in binary format (example: 45 -> 00101101(bin), 0x2D(hex)) you would not waste any space, because you would not use characters but bits.

As always it depends how much data you have. If you are talking about a few thousand numbers, you don't need to worry about varbinary. On the other hand, if you have a few million/billion records then it is worth doing some optimizations.

bssstudio
  • 174
  • 3