I've had a look at similar topics, but no solution I can find exactly compares to what I'm trying to achieve.
I have a cipher text that needs to undergo a simple letter substitution based on the frequency of each letter's occurrence in the text. I already have a function to normalise the text (lowercase, no non-letter characters), count letter occurrences and then get the relative frequency of each letter. The letter is the key in a dictionary, and the frequency is the value.
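For reference, a minimal sketch of that step. The names `normalise` and `letter_frequencies` are placeholders rather than my actual function names, and it assumes that only the letters a-z should be kept:

```python
from collections import Counter
import string


def normalise(text):
    """Lowercase the text and keep only the letters a-z."""
    return ''.join(c for c in text.lower() if c in string.ascii_lowercase)


def letter_frequencies(text):
    """Map each letter in the text to its relative frequency (count / total)."""
    letters = normalise(text)
    counts = Counter(letters)
    total = len(letters)
    if total == 0:
        # No letters at all, so there is nothing to divide by.
        return {}
    return {letter: n / float(total) for letter, n in counts.items()}
```

The returned dictionary has the same shape as the one described above: letter as key, relative frequency as value.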
I also have the expected letter frequencies for A-Z in a separate dictionary (k=letter, v=frequency), but I'm a bit befuddled by what to do next.
What I think I need to do is to take the normalised cipher text, the expected letter freq dict [d1] and the cipher letter freq dict [d2] and iterate over them as follows (part pseudocode):
    for word in text:
        for item in word:
            for k, v in d2.items():
                if d2[v] == d1[v]:
                    replace any instance of d2[k] with d1[k] in text
    decoded_text = open('decoded_text.txt', 'w')
    decoded_text.write(str('the decoded text'))
Here, I want to take text and say "if the value in d2 matches a value in d1, replace any instance of d2[k] with d1[k] in text".
I realise I must have made a fair few basic Python logic errors there (I'm relatively new to Python), but am I on the right track?
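One caveat with the comparison in the pseudocode: the two sets of frequencies are floats and will almost never be exactly equal, so the matching probably has to be done by rank (most common cipher letter to most common expected letter, and so on) rather than by equality. A rough sketch of that idea, assuming Python 3. `build_mapping` and `substitute` are made-up names, and the two arguments stand for d1 and d2 as described above:

```python
def build_mapping(expected_freqs, cipher_freqs):
    """Pair cipher letters with expected letters by frequency rank."""
    # Sort the letters (keys) of each dict from most to least frequent.
    expected_ranked = sorted(expected_freqs, key=expected_freqs.get, reverse=True)
    cipher_ranked = sorted(cipher_freqs, key=cipher_freqs.get, reverse=True)
    # zip stops at the shorter list, so any letters beyond it stay unmapped.
    return dict(zip(cipher_ranked, expected_ranked))


def substitute(text, mapping):
    """Replace every mapped letter in one pass; other characters pass through."""
    # translate() swaps all letters at once, which avoids chained
    # str.replace calls overwriting letters that were already replaced.
    return text.translate(str.maketrans(mapping))
```

With d1 = {'e': 0.5, 't': 0.3, 'a': 0.2} and d2 = {'x': 0.6, 'q': 0.25, 'z': 0.15}, the mapping pairs x with e, q with t and z with a. On real text, rank matching is only a first guess, since letters with similar frequencies can easily end up swapped.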
Thanks in advance
Update:
Thank you for all the helpful suggestions. I decided to try Karl Knechtel's method, with a few alterations to fit in with my code. However, I'm still having problems (entirely in my implementation).
I have made a decode function to take the ciphertext file in question. This calls the count function I made previously, which returns a dictionary (letter:frequency as a float). This meant that the "make uppercase version" code wouldn't work, as k and v were floats and couldn't take .upper as an attribute. So, calling this decode function returns the ciphertext letter frequencies, and then the ciphertext itself, still encoded.
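    from operator import itemgetter

    def sorted_histogram(a_dict):
        # Sort the (letter, frequency) pairs by frequency and keep the frequencies (x[1])
        return [x[1] for x in sorted(a_dict.items(), key=itemgetter(1))]

    def decode(filename):
        text = open(filename).read()
        cipher = text.lower()
        # count() is my existing function; it returns {letter: relative frequency}
        cipher_dict = count(filename)
        # english_dict is the expected-frequency dictionary defined elsewhere
        english_histogram = sorted_histogram(english_dict)
        cipher_histogram = sorted_histogram(cipher_dict)
        mapping = dict(zip(english_histogram, cipher_histogram))
        translated = ''.join(
            mapping.get(c, c)
            for c in cipher
        )
        return translated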
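My current guess at what is going wrong: sorted_histogram returns x[1], i.e. the frequencies, so mapping ends up keyed by floats and mapping.get(c, c) never finds a letter. Also, zip(english_histogram, cipher_histogram) maps English to cipher rather than cipher to English. Below is a sketch with both of those changed, not yet tried on the real file. To keep it self-contained, the hypothetical `decode_text` takes the text and the two frequency dictionaries as arguments instead of calling count(filename) and reading the global english_dict, and `sorted_letters` is a renamed variant of sorted_histogram:

```python
from operator import itemgetter


def sorted_letters(a_dict):
    """Return the letters (keys) ordered by ascending frequency (values)."""
    return [x[0] for x in sorted(a_dict.items(), key=itemgetter(1))]


def decode_text(cipher, cipher_dict, english_dict):
    """Swap each cipher letter for the English letter of the same frequency rank."""
    english_letters = sorted_letters(english_dict)
    cipher_letters = sorted_letters(cipher_dict)
    # Align both lists on their most frequent letters, in case the cipher
    # text uses fewer distinct letters than english_dict contains.
    n = min(len(english_letters), len(cipher_letters))
    if n == 0:
        return cipher
    # Keys are cipher letters, values are the English letters replacing them.
    mapping = dict(zip(cipher_letters[-n:], english_letters[-n:]))
    return ''.join(mapping.get(c, c) for c in cipher)
```

Since the keys and values are now single-character strings, an uppercase version of the mapping could be built with .upper() if needed.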