So I've been trying to get better acquainted with crypto using Python (specifically pycryptodome), and I've come across an interesting issue when trying to decode a byte string into ASCII. Please see the code below:

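import binascii  # needed for hexlify() in the last line

import Crypto.Random  # needed for Crypto.Random.new() below
from Crypto.Signature import PKCS1_v1_5
from Crypto.Hash import SHA
from Crypto.PublicKey import RSA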
message = b'Something secret'

random_gen = Crypto.Random.new().read
print("Type of random_gen: {}".format(type(random_gen)))
private_key = RSA.generate(1024, random_gen) # private key
public_key = private_key.publickey() # public key

signer = PKCS1_v1_5.new(private_key) # signer which uses private key
verifier = PKCS1_v1_5.new(public_key) # verifier which uses public key

h = SHA.new(message) # hash of message
print("Hash: {}".format(h.hexdigest()))

signature = signer.sign(h) # sign hashed version of message
print("Signature type = {}".format(type(signature)))
print("Signature: {}".format(binascii.hexlify(signature).decode('ascii')))

In the very last line of the code, why do I first have to hexlify() the signature, which is of type <class 'bytes'>, before decoding it into ASCII so that I can read it? Why is it that if I do:

print("Signature: {}".format(signature.decode('ascii')))

I get the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x88 in position 2: ordinal not in range(128)

Thanks for the help.

DGav
  • Because ASCII refers to the 7-bit US-ASCII codepage. It can't represent any byte whose value is above `0x7F`. – Panagiotis Kanavos Jul 18 '18 at 14:27
  • @PanagiotisKanavos - right but I thought that hexlify-ing something simply returns the hex representation of the byte string and does not change the string of bytes itself; therefore, if a byte is out of range to be decoded by ascii then why would converting it to hex work? – DGav Jul 18 '18 at 14:32

1 Answer

signature is a sequence of bytes: each element is an integer between 0 and 255 inclusive. If you attempt to decode it directly as ASCII, any value above 127 raises a UnicodeDecodeError.
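To see this concretely, here is a small sketch using a made-up three-byte value (not a real signature). Its third byte is 0x88, chosen only to mirror the error in the question:

```python
# Hypothetical sample bytes; the third byte (0x88 = 136) is outside the ASCII range.
sample = bytes([0x41, 0x42, 0x88])

# Iterating over a bytes object yields plain integers in the range 0..255.
print(list(sample))

try:
    sample.decode('ascii')
    decode_error = None
except UnicodeDecodeError as exc:
    # ASCII only defines code points 0..127, so byte 136 cannot be decoded.
    decode_error = exc
    print(exc)
```

The first two bytes decode fine on their own ('A' and 'B'); it is only the byte above 127 that triggers the exception.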

binascii.hexlify returns a new sequence of bytes built from its input: for each input byte, it emits two output bytes, which are the ASCII codes of the two characters in that byte's hexadecimal representation. So each byte of the output represents an ASCII character, either between '0' and '9' or between 'a' and 'f'. For example, the input byte 128 produces the two characters "80", that is, the two bytes 56 and 48 (the ASCII codes of the characters '8' and '0').
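The byte-128 example can be checked directly. This is only a minimal illustration using the standard-library binascii module:

```python
import binascii

# One input byte with value 128 (0x80).
hexed = binascii.hexlify(bytes([128]))

print(hexed)        # b'80': two output bytes for one input byte
print(list(hexed))  # [56, 48]: the ASCII codes of '8' and '0'

# Every output byte is an ASCII hex digit, so it is always below 128.
all_ascii = all(b < 128 for b in hexed)
print(all_ascii)
```

Because the output only ever contains the codes for '0'-'9' and 'a'-'f', decoding it as ASCII cannot fail, whatever the input bytes were.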

So binascii.hexlify produces the hexadecimal representation, in ASCII form, of a binary input. Applying decode('ascii') after binascii.hexlify does not change the content; it only produces an object of type str instead of bytes.
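A short sketch of that last point, again with made-up input bytes: the hexlified value and its decoded form hold the same characters and differ only in type, and the original bytes can be recovered with binascii.unhexlify.

```python
import binascii

raw = bytes([0x00, 0x88, 0xFF])        # hypothetical input, includes non-ASCII bytes

hex_bytes = binascii.hexlify(raw)      # bytes object: b'0088ff'
hex_text = hex_bytes.decode('ascii')   # str object:   '0088ff'

print(type(hex_bytes), hex_bytes)
print(type(hex_text), hex_text)

# The hex form is lossless: converting back yields the original bytes.
round_trip = binascii.unhexlify(hex_text)
print(round_trip == raw)
```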

In Python 3.5 and above you can simply use the hex method of a bytes object to obtain a str object containing its hexadecimal representation:

signature.hex()
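As a quick check that the two approaches agree, here is a sketch with a stand-in value (signature_like is a hypothetical placeholder, not an actual RSA signature):

```python
import binascii

# Placeholder bytes standing in for a real signature; includes values above 127.
signature_like = bytes(range(120, 136))

via_hex = signature_like.hex()                                   # Python 3.5+
via_hexlify = binascii.hexlify(signature_like).decode('ascii')   # older spelling

print(via_hex)
print(via_hex == via_hexlify)

# bytes.fromhex() is the inverse, turning the hex string back into bytes.
restored = bytes.fromhex(via_hex)
print(restored == signature_like)
```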
Jérôme Migné
  • Ohhh, very interesting. I did not realize `binascii.hexlify` returns a new sequence of bytes, and I will have to look closer at the method you described by which this happens. Thanks – DGav Jul 24 '18 at 20:07