6

I want to know whether RSA signatures are unique for a given piece of data.

Suppose I have the string "hello". To compute the RSA signature, you first get the SHA-1 digest (which, I know, is unique for the data), then add a header containing the OID and the padding scheme, and do some mathematical jiggling to produce the signature.

Now, assuming the padding is the same, will the signatures generated by OpenSSL and Bouncy Castle be the same?

If yes, my only fear is: won't it be easy to get back the "text"/data?

I actually tried computing an RSA signature over some data, and the signatures from OpenSSL and BC were different. I repeated it, but each library gave me the same signature again and again. I realized that the two libraries' signatures differed because of a difference in padding. However, I am still not sure why each library produces the same signature every time I repeat the operation. Can somebody please give an easy explanation?

user489152
  • 2
    SHA-1 digests are not unique. You cannot represent every possible 161-bit value in a unique 160-bit hash, let alone represent every possible message uniquely. – Wooble May 04 '11 at 12:56

3 Answers

15

The "usual" padding scheme, described in PKCS#1 as the "old-style, v1.5" padding, is deterministic. It works like this:

  • The data to sign is hashed (e.g. with SHA-1).
  • A fixed header is added; that header is actually an ASN.1 structure which identifies the hash function which was just used to process the data.
  • Padding bytes are added (on the left): 0x00, then 0x01, then some 0xFF bytes, then 0x00. The number of 0xFF bytes is adjusted so that the resulting total length is exactly the byte length of the modulus (i.e. 128 bytes for a 1024-bit RSA key).
  • The padded value is converted to an integer (which is less than the modulus), which goes through the modular exponentiation which is at the core of RSA. The result is converted back to a sequence of bytes, and that's the signature.
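The steps above can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not how OpenSSL or Bouncy Castle implement it: the key is a toy key built from two publicly known Mersenne primes (so it is completely insecure), and the names `pkcs1_v15_encode` and `sign` are made up for this sketch.

```python
import hashlib

# Toy key for illustration only (NOT secure): both primes are publicly known
# Mersenne primes, chosen so the example needs no key-generation library.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                              # 1128-bit modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)

# DER-encoded DigestInfo prefix for SHA-1 (the variant with NULL parameters).
SHA1_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")


def pkcs1_v15_encode(message: bytes, k: int) -> bytes:
    """Build 00 01 FF..FF 00 || header || hash, exactly k bytes long."""
    t = SHA1_PREFIX + hashlib.sha1(message).digest()
    ps = b"\xff" * (k - len(t) - 3)    # as many 0xFF bytes as needed
    return b"\x00\x01" + ps + b"\x00" + t


def sign(message: bytes) -> bytes:
    k = (N.bit_length() + 7) // 8      # byte length of the modulus
    em = pkcs1_v15_encode(message, k)
    s = pow(int.from_bytes(em, "big"), D, N)   # the modular exponentiation
    return s.to_bytes(k, "big")


sig1 = sign(b"hello")
sig2 = sign(b"hello")
print(sig1 == sig2)   # True: nothing in the process is random
```

Since no step draws random bytes, repeating the call with the same key and data can only give the same bytes back.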

All these operations are deterministic; there is no randomness involved. Hence it is normal and expected that signing the same data with the same key and the same hash function will yield the same signature every time.

However, there is a slight underspecification in the ASN.1-based fixed header. This is a structure which identifies the hash function, along with "parameters" for that hash function. Usual hash functions take no parameters, hence the parameters shall either be represented with a special "NULL" value (which takes a few bytes) or be omitted altogether: both representations are acceptable (although the former is supposedly preferred). So the net effect is that there are two versions of the "fixed header" for a given hash function. OpenSSL and Bouncycastle do not use the same header. However, signature verifiers are supposed to accept both.
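To make the two header variants concrete, here is a hedged Python sketch. The two byte strings are the DER encodings of the SHA-1 header with and without the NULL parameters; the toy key (two well-known Mersenne primes, not secure) and the function names `sign` and `verify` are assumptions made for this illustration. Which library emits which variant is not something the sketch shows.

```python
import hashlib

# Toy key for illustration only (NOT secure).
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))      # needs Python 3.8+
K = (N.bit_length() + 7) // 8          # modulus length in bytes

# SHA-1 header with an explicit NULL parameter, and with parameters omitted.
PREFIX_WITH_NULL = bytes.fromhex("3021300906052b0e03021a05000414")
PREFIX_NO_PARAMS = bytes.fromhex("301f300706052b0e03021a0414")


def encode(message: bytes, prefix: bytes) -> bytes:
    t = prefix + hashlib.sha1(message).digest()
    return b"\x00\x01" + b"\xff" * (K - len(t) - 3) + b"\x00" + t


def sign(message: bytes, prefix: bytes) -> bytes:
    em = encode(message, prefix)
    return pow(int.from_bytes(em, "big"), D, N).to_bytes(K, "big")


def verify(message: bytes, signature: bytes) -> bool:
    """Accept a signature built with either header variant."""
    em = pow(int.from_bytes(signature, "big"), E, N).to_bytes(K, "big")
    return any(em == encode(message, p)
               for p in (PREFIX_WITH_NULL, PREFIX_NO_PARAMS))


sig_a = sign(b"hello", PREFIX_WITH_NULL)
sig_b = sign(b"hello", PREFIX_NO_PARAMS)
print(sig_a == sig_b)                                   # False
print(verify(b"hello", sig_a), verify(b"hello", sig_b))  # True True
```

The two signatures differ even though key, data and hash are identical, and a verifier that tries both headers accepts either one.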

PKCS#1 also describes a newer padding scheme, called PSS, which is more complex but with a stronger security proof. PSS includes a bunch of random bytes, so you will get a distinct signature every time.
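As a rough illustration of why PSS signatures differ from run to run, here is a minimal Python sketch of the PSS encoding step followed by the RSA private operation. It assumes SHA-1 for both the hash and the mask generation function, a 20-byte salt, and the same insecure toy key built from two known Mersenne primes; `mgf1`, `pss_encode` and `sign_pss` are names chosen for this sketch, and it omits verification entirely.

```python
import hashlib
import os

# Toy key for illustration only (NOT secure).
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))      # needs Python 3.8+
MOD_BITS = N.bit_length()
H_LEN = 20                             # SHA-1 output length in bytes
S_LEN = 20                             # salt length (assumed: same as hash)


def mgf1(seed: bytes, length: int) -> bytes:
    """Mask generation function: hash the seed with a 4-byte counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha1(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def pss_encode(message: bytes, em_bits: int) -> bytes:
    em_len = (em_bits + 7) // 8
    m_hash = hashlib.sha1(message).digest()
    salt = os.urandom(S_LEN)                       # the random part
    h = hashlib.sha1(b"\x00" * 8 + m_hash + salt).digest()
    db = b"\x00" * (em_len - S_LEN - H_LEN - 2) + b"\x01" + salt
    mask = mgf1(h, em_len - H_LEN - 1)
    masked = bytearray(a ^ b for a, b in zip(db, mask))
    masked[0] &= 0xFF >> (8 * em_len - em_bits)    # clear the unused top bits
    return bytes(masked) + h + b"\xbc"


def sign_pss(message: bytes) -> bytes:
    em = pss_encode(message, MOD_BITS - 1)
    k = (MOD_BITS + 7) // 8
    return pow(int.from_bytes(em, "big"), D, N).to_bytes(k, "big")


print(sign_pss(b"hello") == sign_pss(b"hello"))    # False, salts differ
```

The only source of variation is the fresh salt drawn by `os.urandom` on every call; everything after that is again deterministic.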

Thomas Pornin
  • Very insightful! Thanks Thomas. One doubt though: is the "fixed header" the PKCS#1 header itself, or the padding plus the fixed header? Does BC also use the PKCS#1 header, or is it different, since you mentioned that the ASN.1-based header uses different parameters? Or does PKCS#1 just define some headers, and different libs provide different parameters for it? – user489152 May 04 '11 at 14:50
  • 1
    @user489152: the full padding is: 00 01 FF .. FF 00 30 .. 04 14 xx ..; the 'xx ..' part is the SHA-1 hash. The ASN.1 header is the one from '30' to '04 14', and it is the one which has two valid versions. The first part (from '00 01' to the 'FF 00') is called "type 1 padding" in PKCS#1 and is always the same (the number of 'FF' is adjusted to match the key size). – Thomas Pornin May 04 '11 at 15:32
  • Thanks. And is the new PSS scheme already provided by the BC and OpenSSL libs in their newer versions? I have OpenSSL 1.0, and a hexdump of a signature shows the scheme you mentioned above. – user489152 May 04 '11 at 16:07
  • 1
    @user489152: this page says that both Bouncycastle and OpenSSL support RSA-PSS, and it shows how to do it with OpenSSL: http://work-now-dammit.blogspot.com/2010/04/rsa-pss-signing-with-openssl-and.html – Thomas Pornin May 04 '11 at 16:10
4

Signatures are not a privacy mechanism; it's not considered a problem if you can get the plaintext back out. If your message must be kept secret, then encrypt as well as sign.

Nevertheless, remember that RSA signatures are created using the signer's private key. Given such a signature, you can use the signer's public key to "undo" the RSA transform (raise the signature to the power e, mod n) and get out the SHA-1 or other hash value that was provided as its input. You still can't undo the hash function to get the input plaintext corresponding to a signature that has become detached from its message.
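Here is a small Python sketch of that "undo" step, assuming the deterministic v1.5 padding with SHA-1 and an insecure toy key made from two known Mersenne primes. The names `sign` and `recover_digest` are invented for this example.

```python
import hashlib

# Toy key for illustration only (NOT secure).
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))      # needs Python 3.8+
K = (N.bit_length() + 7) // 8

# DER-encoded SHA-1 header (variant with NULL parameters).
SHA1_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")


def sign(message: bytes) -> bytes:
    t = SHA1_PREFIX + hashlib.sha1(message).digest()
    em = b"\x00\x01" + b"\xff" * (K - len(t) - 3) + b"\x00" + t
    return pow(int.from_bytes(em, "big"), D, N).to_bytes(K, "big")


def recover_digest(signature: bytes) -> bytes:
    """Apply the public operation s^e mod n and return the trailing hash."""
    em = pow(int.from_bytes(signature, "big"), E, N).to_bytes(K, "big")
    return em[-20:]                    # SHA-1 digests are 20 bytes long


sig = sign(b"hello")
digest = recover_digest(sig)
print(digest == hashlib.sha1(b"hello").digest())   # True
```

Anyone holding the public key can get the digest this way, but going from the digest back to `b"hello"` would still require inverting SHA-1 or guessing the message.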

RSA for encryption is a different matter. Padding methods for encryption here do include random data in order to defeat traffic analysis.

crazyscot
  • Thanks a lot for the clarification. Just out of curiosity, if hashes are one-way mechanisms and the hash for a particular input is unique (well, although Wooble disagreed), can we not use some mapping mechanism to match it against some inputs and retrieve the data? Also, can you elaborate on encryption with AES or RSA? Will the same text give the same ciphertext with the same IV and key each time? Then how is the data getting hidden? – user489152 May 04 '11 at 13:22
  • @user489152: It is possible to map hashes to their source, and such datasets (called rainbow tables) are often used to crack passwords. However, this technique is simply not viable in most situations. Since the amount of data you can supply to a hash is limitless, you would end up with a table with 2^160 keys (for SHA-1), each of which would have effectively infinite entries! – David Grant May 04 '11 at 13:31
  • With AES - indeed with any sane symmetric cipher - with the same input and key, you are guaranteed to get the same output. This is why CBC mode exists, which adds the IVs to make the result look different next time. If you use the same IV you are going to get the same result - but that breaks the fundamental assumption on which IVs are based, so don't do that. – crazyscot May 04 '11 at 15:35
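On the earlier point about mapping hashes back to their inputs, a toy Python sketch shows both why it works for guessable inputs and why it fails otherwise. This is a plain precomputed lookup table rather than a true rainbow table (which trades storage for computation), and the candidate list is made up for the example.

```python
import hashlib

# Hypothetical list of guessable inputs; real attacks use far larger lists.
candidates = ["hello", "world", "password", "letmein"]

# Precompute digest -> input once, then look digests up later.
table = {hashlib.sha1(w.encode()).hexdigest(): w for w in candidates}


def reverse_lookup(digest_hex: str):
    """Return the known input for this digest, or None if it is not listed."""
    return table.get(digest_hex)


known = hashlib.sha1(b"hello").hexdigest()
unknown = hashlib.sha1(b"not in the list").hexdigest()
print(reverse_lookup(known))     # hello
print(reverse_lookup(unknown))   # None
```

The lookup only succeeds for inputs someone thought to precompute, which is why it is practical for short passwords and not for arbitrary messages.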
-2

This is why you add a salt/initialisation vector on top of your key. That way it shouldn't be possible to tell which records came from the same plaintext.

Justin Simon