I want to break repeating-key XOR. I don't know anything about the key or the message; the only thing I know is that a repeating key was used. The message was base64-encoded after being encrypted with repeating-key XOR, so I first converted it from base64 to base16 (hex) to make it easier to work with. I have instructions, but I don't understand them very well.
Let KEYSIZE be the guessed length of the key; try values from 2 to (say) 40. Write a function to compute the edit distance/Hamming distance between two strings.
For each KEYSIZE, take the first KEYSIZE worth of bytes, and the second KEYSIZE worth of bytes, and find the edit distance between them. Normalize this result by dividing by KEYSIZE.
The KEYSIZE with the smallest normalized edit distance is probably the key. You could proceed perhaps with the smallest 2-3 KEYSIZE values. Or take 4 KEYSIZE blocks instead of 2 and average the distances.
Now that you probably know the KEYSIZE: break the ciphertext into blocks of KEYSIZE length, etc. I understand this step and the rest, for now; I will only know for sure whether the key size is good once I try to decode.
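For reference, the Hamming distance from the first step can be sketched directly on raw bytes instead of hex strings. This is only a minimal sketch: the name `hamming_distance` is an assumption, not something the instructions prescribe, and it assumes both inputs have the same length.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # XOR leaves a 1 bit exactly where the inputs differ; count those bits.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Quick sanity check on two 14-byte strings.
print(hamming_distance(b"this is a test", b"wokka wokka!!!"))  # → 37
```

Working on bytes avoids the zero-padding step that is needed when the blocks are first turned into binary strings.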
I wrote code for the key-size step in Python. It runs, but I am not completely sure I have done it correctly:
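def compute_distance(str1, str2, keysize):
    # Hamming distance in bits between two hex strings of keysize bytes each.
    count = 0
    str1 = str1.replace("\n", "")
    str2 = str2.replace("\n", "")
    # Zero-pad to keysize*8 bits so leading zero bits are not dropped.
    width = str(keysize * 8)
    sbin1 = format(int(str1, 16), '0' + width + 'b')
    sbin2 = format(int(str2, 16), '0' + width + 'b')
    for c1, c2 in zip(sbin1, sbin2):
        if c1 != c2:
            count += 1
    return count
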
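def keysize_dist(filelocation):
    # The file is expected to hold the ciphertext as hex text.
    with open(filelocation, 'r') as f:
        lines = ''.join(line.strip() for line in f)
    normalized = []
    for keysize in range(2, 41):  # 2 to 40 inclusive
        # One byte is two hex characters, so each block is keysize*2 characters.
        count = compute_distance(lines[0:keysize*2], lines[keysize*2:keysize*4], keysize)
        normalized.append((float(count) / keysize, keysize))
    # Return the keysize with the smallest normalized distance, not the distance itself.
    return lines, min(normalized)[1]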
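
The instructions also suggest taking 4 KEYSIZE blocks instead of 2 and keeping the smallest 2-3 candidates. A rough sketch of that variant, working on the decoded bytes rather than on hex text, could look like this. The names `rank_keysizes` and `hamming_distance`, the `blocks` parameter, and the file name `ciphertext.txt` are all assumptions made for illustration.

```python
import base64
from itertools import combinations


def hamming_distance(a: bytes, b: bytes) -> int:
    # Number of differing bits between two equal-length byte strings.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def rank_keysizes(data: bytes, min_size: int = 2, max_size: int = 40, blocks: int = 4):
    """Return (score, keysize) pairs sorted so the most likely key sizes come first."""
    scores = []
    for keysize in range(min_size, max_size + 1):
        chunks = [data[i * keysize:(i + 1) * keysize] for i in range(blocks)]
        if len(chunks[-1]) < keysize:
            continue  # not enough ciphertext for this key size
        # Average the distance over every pair of blocks, then normalize by keysize.
        pairs = list(combinations(chunks, 2))
        total = sum(hamming_distance(a, b) for a, b in pairs)
        scores.append((total / (len(pairs) * keysize), keysize))
    return sorted(scores)


# Hypothetical usage: "ciphertext.txt" stands in for the base64 file.
# with open("ciphertext.txt") as f:
#     data = base64.b64decode(f.read())
# print(rank_keysizes(data)[:3])  # the three most likely key sizes
```

Averaging over several block pairs makes the score less sensitive to one unlucky pair of blocks, and returning a sorted list makes it easy to try the top few key sizes when decoding.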