
I'm trying to write an implementation of SHA-256 in Python 3. My version is supposed to take a hexadecimal-encoded input and output the corresponding hash value. I've used https://en.wikipedia.org/wiki/SHA-2#Pseudocode as a guide.

My function works well for most inputs, but sometimes it gives an output that is only 63 hex characters long (instead of 64). My function uses 32-bit binary strings.

I think I have found the problem: in the last step of the algorithm, the binary addition

h4 := h4 + e (or another h-vector and corresponding letter)

yields a binary number that is too small. The last thing I do is to use hex() and I should get a string of 8 characters. In this example I only get 7.

out4 = hex(int(h4,2))[2:]

One problematic input is e5e5e5. It gives "10110101111110101011010101101100" for h4 and "01010001000011100101001001111111" for e, so the addition (modulo 2^32) gives "00000111000010010000011111101011" and out4 = 70907eb.
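The numbers above can be reproduced with a short snippet. This is only a sketch of the final step as described in the question: the variable names `total` and `as_bits` are illustrative, and the wrap-around modulo 2**32 is assumed from the SHA-256 pseudocode.

```python
# Values quoted in the question for the input e5e5e5.
h4 = "10110101111110101011010101101100"
e = "01010001000011100101001001111111"

# 32-bit addition: the sum wraps around modulo 2**32 (assumed from the pseudocode).
total = (int(h4, 2) + int(e, 2)) % 2**32

# Back to a 32-character binary string, as the implementation in the question uses.
as_bits = format(total, "032b")

# The conversion from the question: hex() drops the leading zero digit.
out4 = hex(int(as_bits, 2))[2:]

print(as_bits)         # 00000111000010010000011111101011
print(out4, len(out4)) # 70907eb 7
```

The top four bits of the sum are all zero, so the first hex digit is 0 and `hex()` leaves it out.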

What should I do in these cases?

xaxablyat
  • Are you doing this as a learning exercise? For production code you should _**never** implement your own cryptographic primitives_. Always _always_ use a tried and true, widely-used, open-source library. – ChrisGPT was on strike Nov 30 '19 at 14:53
  • Not sure I fully understand your case. But *maybe simply left-pad by zeros*? For example, if you get `70907eb` then you can pad it to be `070907eb`. – Yaniv Nov 30 '19 at 14:54
  • The implementation is only as a learning exercise! – xaxablyat Nov 30 '19 at 15:03

1 Answer


I should get a string of 8 characters

Why do you think so? hex doesn't let you specify the length of its output. For example, if the correct output is eight zero digits, hex will return 0x0, the shortest representation possible.

I'm guessing the correct output should begin with zero, but hex is cutting it off. Use format strings to specify the length of output:

In [1]: f'{0:08x}'                                                             
Out[1]: '00000000'  # lowercase hexadecimal (x) digits that must fit into at least 8 characters, prefixed with zero (08) as needed
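Applied to the line from the question, this could look roughly like the sketch below. The helper name `word_to_hex` is hypothetical, and the eight identical words are only there to show that the joined length comes out right; they are not a real digest.

```python
def word_to_hex(bits: str) -> str:
    """Convert a 32-bit binary string to exactly 8 lowercase hex digits.

    Hypothetical helper: '08x' pads with leading zeros up to width 8.
    """
    return format(int(bits, 2), "08x")


# The sum from the question, as a 32-bit binary string.
h4 = "00000111000010010000011111101011"
out4 = word_to_hex(h4)
print(out4)  # 070907eb

# Joining eight padded words always yields 64 characters.
words = [h4] * 8  # placeholder values, not a real hash state
digest = "".join(word_to_hex(w) for w in words)
print(len(digest))  # 64
```

The equivalent f-string form is `f'{int(h4, 2):08x}'`; either one replaces `hex(int(h4,2))[2:]` without needing to slice off the `0x` prefix.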
ForceBru
  • I think it should be so because I know that the total output is supposed to always be 64 characters long and I get it as a sum of 8 different vectors. I have tried experimenting a bit with format but I don't really understand it. My output vector is given like this out4 = hex(int(h4,2))[2:] Should I put the format like this out4 = hex(int(h4,2))[2:]f'{0:08x}' ? – xaxablyat Nov 30 '19 at 15:00
  • @xaxablyat, no, please see [this](https://realpython.com/python-f-strings) about string formatting and f-strings – ForceBru Nov 30 '19 at 15:02
  • Thanks for the help! – xaxablyat Nov 30 '19 at 15:28