
Suppose I have a 64-byte signature (from ed25519) that one party creates. This party must compress the signature further, so that it fits in 4-8 digits in base 2048. Then, the second party must be able to recreate the signature from that data.

Here is an example of a decimal signature: 5670805304946899675614751184947294808143702505785021095830828785725573127924144977212837580418240432902375737987653828318622222068237988634991262293689098

How can I compress this signature to around 4 digits in base 2048? Is this possible using Sudoku compression?

Jon
  • Some quick questions... What are your criteria for human-readability? Also, are you trying to compress the original 64-byte value, or the decimal representation of that 64-byte value? Should there be any security involved (e.g. is the random number supposed to increase security in some way?) Does "compressed" include the bytes used for the secret shared random number? Is that random number different each time? – templatetypedef Jan 12 '15 at 22:06
  • How do you know that the signature is compressible (has entropy)? – 500 - Internal Server Error Jan 12 '15 at 22:08
  • @templatetypedef Human-readability isn't a big deal, I've updated my question to reflect this. It depends on which is easiest to compress, if compressing the original 64-byte value works, I'm all for it. The shared random number is not a secret, security is not a factor here; we're trying to just recreate the signature. The shared random number is actually a timestamp divided by 30, so it's different every 30 seconds. – Jon Jan 12 '15 at 22:19
  • @Jon Can you elaborate more on what exactly you're trying to do here? What you're describing seems pretty unusual - signatures are usually incompressible by design. – templatetypedef Jan 12 '15 at 22:20
  • @templatetypedef I'm attempting to create a short token that represents the signature, which the second party can use to recreate the original signature. – Jon Jan 12 '15 at 22:29
  • So you're trying to compress 512 bits down to 88? Anybody suggest that the task might be impossible? – Mark Ransom Jan 12 '15 at 22:40
  • @Jon Hashes are highly unlikely to be compressible even by good compressors due to information-theoretic restrictions. Could you do something silly like just making a database of hashes keyed by easy-to-remember things? – templatetypedef Jan 12 '15 at 22:47
  • @templatetypedef No, that wouldn't fit with what I'm doing. Forgive my ignorance here, but isn't it possible to make up some arbitrary formula which produces a smaller number using the shared number and the decimal representation? Then the second party could reverse the formula in order to retrieve the original decimal representation. Is this proper in any sense? – Jon Jan 12 '15 at 22:55
  • @Jon You can do that, but in that case the user then has to still remember that extra number. I'm not sure I see why that's any better than just giving the hash back. – templatetypedef Jan 12 '15 at 23:00
  • @templatetypedef The idea is that the user doesn't need to remember the extra number, as it's derived from the current time. The number would be used by the verifying party to recreate the original decimal signature. So as an example, (I fully realise how terrible this might be) `signature / time^20` results in a much smaller number (1837316.281017832) which could be used to recreate the original signature. Though this isn't small enough, and isn't being created in any sort of proper way, it shows what I'm wanting to achieve. – Jon Jan 12 '15 at 23:11
  • @MarkRansom Yes! What are your thoughts on the method I just commented on? – Jon Jan 12 '15 at 23:29
  • I think that a simple arithmetic operation won't reduce the number of bits required, merely shift them around a bit. You might be able to find a way to reduce the problem by the number of bits in the shared number, but beyond that I stand by my initial reaction that this is impossible. – Mark Ransom Jan 12 '15 at 23:58

2 Answers

I would argue this is impossible, at least the part where you say "the second party must be able to recreate the signature from the data".

The simple reason behind this is entropy, i.e. the amount of information contained in each signature. First, let us look at how much information each of the "formats" you describe can store at most.

  • ed25519 signature: 64 bytes, that is 512 bits (thus 2^512 possibilities, around 1.34e154)
  • 4 digits in base 2048: 2048^4 = (2^11)^4 = 2^44 possibilities, that is 44 bits
  • 8 digits (since you said 4-8): by the same reasoning, 2^88 possibilities, that is 88 bits
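
The arithmetic in the list above can be checked with a minimal sketch. The helper name `capacity_bits` is made up for illustration; it only restates that each base-2048 digit carries log2(2048) = 11 bits.

```python
import math

BASE = 2048  # each base-2048 digit carries log2(2048) = 11 bits


def capacity_bits(num_digits, base=BASE):
    """Maximum number of bits that `num_digits` digits in `base` can hold."""
    return num_digits * math.log2(base)


signature_bits = 64 * 8              # 64-byte signature
short_token_bits = capacity_bits(4)  # 4 digits in base 2048
long_token_bits = capacity_bits(8)   # 8 digits in base 2048

print(signature_bits, short_token_bits, long_token_bits)
```

Even the 8-digit token has room for only 88 of the 512 bits.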

So the base-2048 digits can hold far less information to begin with. For your function to exist, the 2^512 possible values would have to contain enough redundancy (e.g. knowing bits a and b gives you a high probability of knowing bit c, or some configurations of values are outright impossible) that every signature that can actually occur could be identified with 44 (or 88) bits.

Let us take a look at Shannon's source coding theorem:

N i.i.d. random variables each with entropy H(X) can be compressed into more than N H(X) bits with negligible risk of information loss, as N → ∞; but conversely, if they are compressed into fewer than N H(X) bits it is virtually certain that information will be lost.

Here the random variables are the ed25519 signatures. You are asking two things:

  1. Whether H(<random ed25519 signatures>) can be as low as 44 or 88 bits.
  2. Whether this limit can be reached with N=1 instead of N→∞: you want every single signature to be encoded in 44 or 88 bits, rather than the average number of bits per signature over N signatures being below 44 or 88. This is a much stronger requirement.

There is definitely more entropy in an ed25519 signature than can ever be stored in 44 or 88 bits. From the site Introduction to Ed25519:

High security level. This system has a 2^128 security target; breaking it has similar difficulty to breaking NIST P-256, RSA with ~3000-bit keys, strong 128-bit block ciphers, etc. The best attacks known actually cost more than 2^140 bit operations on average, and degrade quadratically in success probability as the number of bit operations drops.

But if your function existed, breaking the system would probably be much easier, because it would suffice to make 2^44 (or 2^88) tries, each time applying the "reverse" function, to exhaustively enumerate every signature that can occur. Sure, we don't know how many bit operations the hypothetical reverse function would cost, but it gives you an idea. Moreover, if finding a collision is enough, a birthday attack needs only about the square root of that number of tries (thus 2^22 or 2^44).

Conversely, if you read the paper describing the attack that takes 2^140 operations on average, and it uses 2^i iterations of 2^o operations each (thus i+o=140), you could hope to find a format that reasonably enumerates all possible 64-byte signatures with 2*i bits. However, that would only address your first question, since the attack could take advantage of properties such as some signature values occurring more often than others. The optimal storage length of 2*i bits would then be reached only on average, not for every value, by coding values that occur more often with fewer bits than values that occur less often.

Adding to this, we read:

Small signatures. Signatures fit into 64 bytes. These signatures are actually compressed versions of longer signatures; the times for compression and decompression are included in the cycle counts reported above.

This means that even if there were some redundancy in the longer signature, a compression pass has already been applied, and we can reasonably assume that the 64-byte signature has the same, if not a higher, density of information. In other words, the chances of finding redundancy to exploit are even smaller.
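
As a rough illustration of how dense 64 high-entropy bytes already are, here is a small sketch using Python's `zlib`. Note the assumptions: the input is a SHA-512 digest used as a stand-in for a signature (it is not a real Ed25519 signature), and a general-purpose compressor failing to shrink it is only suggestive, not a proof of incompressibility.

```python
import hashlib
import zlib

# Stand-in for a signature: 64 high-entropy bytes taken from SHA-512.
# This is an assumption for illustration, not a real Ed25519 signature.
stand_in = hashlib.sha512(b"example message").digest()

# Try the strongest zlib setting; the output carries header and checksum
# overhead and finds no redundancy to exploit in the payload.
compressed = zlib.compress(stand_in, 9)

print(len(stand_in), len(compressed))
```

The "compressed" output comes out longer than the input, which is the typical behaviour of any lossless compressor on data that looks random.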

So if you apply a transformation from the signature to 44-88 bits, you will lose information; you are effectively taking a hash of your 64 bytes. Just as it is impossible to recreate a file you downloaded from its checksum, it would be impossible to recreate the ed25519 signature from the hash you calculated.
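
The loss of information can be seen on a toy scale. The sketch below is only an illustration under stated assumptions: 12-bit inputs stand in for the 512-bit signature space, 8-bit tokens stand in for the 44- or 88-bit token space, and `toy_token` is a hypothetical "compressor" (the first byte of a SHA-256 digest). Whatever function is used in its place, the pigeonhole principle forces some token to have many preimages.

```python
import hashlib
from collections import Counter

INPUT_BITS = 12  # toy stand-in for the 512-bit signature space
TOKEN_BITS = 8   # toy stand-in for the 44- or 88-bit token space


def toy_token(value):
    """Hypothetical 'compressor': first byte of SHA-256 of the value."""
    digest = hashlib.sha256(value.to_bytes(2, "big")).digest()
    return digest[0]


# Count how many inputs land on each token.
buckets = Counter(toy_token(v) for v in range(2 ** INPUT_BITS))
worst = max(buckets.values())

# At most 256 distinct tokens exist, so some token must cover
# at least 4096 / 256 = 16 different inputs.
print(len(buckets), worst)
```

A receiver holding only the token cannot tell which of those inputs was meant; with 512-bit inputs and 88-bit tokens the ambiguity is astronomically larger.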

Cimbali

The signature is 64 bytes, so there are 256^64 or 2^512 possible signatures. This amount of compression will be possible only if at most 2048^8 = 2^88 out of the 2^512 possible signatures are used. This seems highly unlikely to be the case with Ed25519.
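
For comparison, a lossless base-2048 encoding of an arbitrary 64-byte value can be sketched as follows. The function names `to_base2048` and `from_base2048` are made up for this example, and the all-0xFF input is simply the largest possible 64-byte value, not a real signature.

```python
BASE = 2048


def to_base2048(data):
    """Encode bytes as a list of base-2048 digits, most significant first."""
    n = int.from_bytes(data, "big")
    digits = []
    while n:
        n, d = divmod(n, BASE)
        digits.append(d)
    return digits[::-1] or [0]


def from_base2048(digits, length=64):
    """Decode base-2048 digits back into `length` bytes."""
    n = 0
    for d in digits:
        n = n * BASE + d
    return n.to_bytes(length, "big")


worst_case = b"\xff" * 64  # largest 64-byte value
digits = to_base2048(worst_case)
print(len(digits))
```

A value that uses all 512 bits needs ceil(512 / 11) = 47 digits, so a reversible encoding is nowhere near the 4-8 digits asked for.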

Edit: The question was modified and clarified to ask if Sudoku compression would be possible here. There are 6670903752021072936960 = 2^72.49... ways to fill out a Sudoku grid, much smaller than the 9^81 = 2^256.7... ways of labeling each cell. But the same should not be the case with the signature algorithm, so no such compression is information-theoretically possible.
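
The two Sudoku figures above can be reproduced with a short sketch; the variable names are chosen for illustration only.

```python
import math

sudoku_grids = 6670903752021072936960  # valid completed Sudoku grids
labelings = 9 ** 81                    # unconstrained ways to label 81 cells

grid_bits = math.log2(sudoku_grids)
label_bits = math.log2(labelings)

print(round(grid_bits, 2), round(label_bits, 2))
```

Sudoku compresses well because valid grids occupy a tiny fraction of all labelings; nothing comparable is known to restrict the set of 64-byte signatures.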

Charles