23

Possible Duplicate:
bitwise XOR of hex numbers in python

I am trying to XOR two hex strings in Python and do not really know where to start.

I have two hex strings:

a = "32510ba9a7b2bba9b8005d43a304b5714cc0bb0c8a34884dd91304b8ad40b62b07df44ba6e9d8a2368e51d04e0e7b207b70b9b8261112bacb6c866a232dfe257527dc29398f5f3251a0d47e503c66e935de81230b59b7afb5f41afa8d661cb"
b = "32510ba9babebbbefd001547a810e67149caee11d945cd7fc81a05e9f85aac650e9052ba6a8cd8257bf14d13e6f0a803b54fde9e77472dbff89d71b57bddef121336cb85ccb8f3315f4b52e301d16e9f52f90"

Should I be using one of these?

  1. return "".join([chr((x) ^ (y)) for (x,y) in zip(a[:len(b)], b)])
  2. return "".join([chr(ord(x) ^ ord(y)) for (x, y) in zip(a[:len(b)], b)])

I don't understand the difference between the two snippets above. Why chr and ord? I have also seen people using int(hex, 16).

Kok Leong Fong

2 Answers

34

You are missing a couple of things here.

First, you do not want to XOR those strings as they are. They are hex-encoded, so you need to .decode() them into raw bytes first (this is Python 2 code):

binary_a = a.decode("hex")
binary_b = b.decode("hex")

Then, note that the zip() function stops iterating as soon as one of the two sequences is exhausted, so no slicing is needed.
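As a small illustration of that truncating behaviour (the sample strings are made up for the example), zipping with and without the explicit slice gives the same pairs:

```python
a = "abcdef"
b = "xyz"

# zip() stops at the end of the shorter sequence
pairs = list(zip(a, b))
print(pairs)  # [('a', 'x'), ('b', 'y'), ('c', 'z')]

# so slicing a down to len(b) first changes nothing
same = pairs == list(zip(a[:len(b)], b))
print(same)  # True
```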

You need the second version of the loop. First, ord() turns each character into its numeric byte value. This is necessary because ^ works on numbers, not on strings.

After XORing the numbers, chr() converts the result back into a character:

def xor_strings(xs, ys):
    return "".join(chr(ord(x) ^ ord(y)) for x, y in zip(xs, ys))

xored = xor_strings(binary_a, binary_b).encode("hex")

Using .encode() at the end turns the binary string back into a form that prints nicely.
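On Python 3, where str has no .decode("hex"), the same byte-by-byte idea could probably be sketched with bytes.fromhex() and bytes.hex(). The function name xor_hex_bytes is made up for illustration, and the sketch assumes each input has an even number of hex digits (bytes.fromhex() rejects odd lengths):

```python
def xor_hex_bytes(hex_a, hex_b):
    """XOR two hex strings byte by byte, stopping at the shorter one."""
    raw_a = bytes.fromhex(hex_a)  # hex text -> raw bytes
    raw_b = bytes.fromhex(hex_b)
    # iterating over bytes yields ints in Python 3, so no ord()/chr() is needed
    xored = bytes(x ^ y for x, y in zip(raw_a, raw_b))
    return xored.hex()            # raw bytes -> hex text


print(xor_hex_bytes("deadbeef", "0f0f"))  # d1a2
```

The result is only as long as the shorter input, because zip() pairs the bytes from the left and stops early.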

phant0m
  • This doesn't work on Python 3, where `str` objects don't have a `decode` method (and where `"hex"` is not a recognized encoding anyway). – Blckknght Jan 25 '13 at 17:25
  • If the data in the strings is a number in hex, then using ord won't give you what you need: `ord("f") != 15`. Just use `int("f", 16)` as glyphobet said – Facundo Casco Jan 25 '13 at 17:45
  • @F.C. That's why you do `decode()` first ;) – phant0m Jan 25 '13 at 18:10
  • @Blckknght Python 3 wasn't specified in the tags, so it was safe to assume it wasn't needed. If you already know it won't work, why not point out the correct functions to use on Python 3? – phant0m Jan 25 '13 at 18:15
  • I'm not sure if there is an easy way to do byte-by-byte decoding of a hex string to a byte string in Python 3. I didn't find one in my brief checking, and I think @glyphobet's solution to convert to integers rather than byte strings is better in general if you want to do math operations (like XOR) on the value. – Blckknght Jan 25 '13 at 18:30
20

int(s, 16) converts a hex string s to an integer using base 16:

>>> int('f', 16)
15 
>>> int('10', 16)
16

So do this:

def xor_hex(a, b):
    result = int(a, 16) ^ int(b, 16)  # convert to integers and XOR them together
    return '{:x}'.format(result)      # convert back to hexadecimal
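Two things are worth checking before relying on the integer approach. First, '{:x}' drops leading zeros, so the output can be shorter than the inputs. Second, integer XOR lines the two values up at their rightmost digit, whereas the zip()-based answer above pairs characters from the left, so the two approaches disagree when the strings differ in length (as a and b in the question do). A minimal sketch that keeps the width of the longer input, with the hypothetical name xor_hex_int:

```python
def xor_hex_int(hex_a, hex_b):
    """XOR two hex strings as integers, padded to the longer input's width."""
    width = max(len(hex_a), len(hex_b))
    result = int(hex_a, 16) ^ int(hex_b, 16)
    # zfill() restores the leading zeros that '{:x}' would drop
    return "{:x}".format(result).zfill(width)


print(xor_hex_int("0f0f", "0f00"))      # 000f
print(xor_hex_int("deadbeef", "0f0f"))  # deadb1e0 (right-aligned XOR)
```

Which alignment is wanted depends on what the hex strings represent, so this is a judgment call rather than a fix.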
glyphobet
  • This is a good solution, as it works on all of the hex characters at once (giving a single integer) rather than character by character or byte by byte. Long hex strings will end up creating very large integers (which will use the `long` type in Python 2), but this is likely to be invisible to code that uses them. You can avoid needing to slice off the `0x` bit by using string formatting, rather than the `hex` builtin: `"{:x}".format(16045690984833335023)` returns `'deadbeefdeadbeef'` – Blckknght Jan 25 '13 at 17:25
  • Good point! I've updated the code to use `{:x}.format()` instead of `hex()`. Thanks. – glyphobet Jan 25 '13 at 19:50
  • @Blckknght nice ans. I just wanted to know how '{:x}.format()' works . Thx. – Raman Singh Jan 24 '15 at 16:39
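
On the last question: in '{:x}'.format(n), the x after the colon is a format specifier that asks for lowercase hexadecimal without the 0x prefix. A short sketch on Python 3, with a sample value chosen only for illustration:

```python
n = 0xdeadbeef

print(hex(n))               # 0xdeadbeef (builtin, keeps the 0x prefix)
print("{:x}".format(n))     # deadbeef   (lowercase hex, no prefix)
print("{:X}".format(n))     # DEADBEEF   (uppercase hex)
print("{:010x}".format(n))  # 00deadbeef (zero-padded to 10 characters)
print(format(n, "x"))       # deadbeef   (same specifier via format())
```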