
This was made to brute-force Caesar ciphers using a dictionary file from http://www.math.sjsu.edu/~foster/dictionary.txt. It runs through three functions: lang_lib(), which loads the text of the dictionary into a list of words; isEnglish(), which checks what percentage of the words in a phrase appear in the dictionary and returns True if at least 60% of them match; and caeser(), which runs through all shifts and checks each result for English words. It should return the result with the highest percentage, but it only seems to work for shifts 1-18. I can't figure out why it isn't working.

def lang_lib():
    # Read the whitespace-separated dictionary file into a list of words;
    # the with-block closes the file automatically.
    with open('dictionary.txt', 'r') as file:
        return file.read().split()

dictionary = lang_lib()

def isEnglish(text):
    split_text = text.lower().split()
    counter = 0
    not_in_dict = []
    for word in split_text:
        if word in dictionary:
            counter += 1
        else:
            not_in_dict.append(word)

    length = len(split_text)
    text_percent = ((counter / length) * 100)
    #print(text_percent)
    if text_percent >= 60.0:
        return True
    else:
        return False

alphabet = "abcdefghijklmnopqrstuvwxyz0123456789!@#$%/."

def caeser(text): #Put in text, and it will spit out all possible values
    lower_text = text.lower()
    ciphertext = "" #stores current cipher value
    matches = [] #stores possible matches

    for i in range(len(alphabet)): #loops for the length of input alphabet
        for c in lower_text:
            if c in alphabet:
                num = alphabet.find(c)
                newnum = num - i
                if newnum >= len(alphabet):
                    newnum -= len(alphabet)
                elif newnum < 0:
                    newnum += len(alphabet)
                ciphertext = ciphertext + alphabet[newnum]
            else:
                ciphertext = ciphertext + c

            testing = isEnglish(ciphertext)
            for text in ciphertext:
                if testing == True and len(ciphertext) == len(lower_text):
                    matches.append(ciphertext)
                    return i, matches

        ciphertext = "" #clears ciphertext so it doesn't get cluttered

print(caeser('0x447 #0x$x 74w v0%5')) #shift of 19
print(caeser('zw336 @zw9w 63v uz#4')) #shift of 18

Thanks guys.

ZeZekeZ
  • `return i, matches` exits the function in the first iteration of `for text in ciphertext` where the `if` condition holds. It also looks like you may have an indentation error. Please try to reduce this to a minimal reproducible example. – tripleee Jan 29 '20 at 04:28
  • @tripleee it is supposed to exit there – ZeZekeZ Jan 29 '20 at 04:38
  • Try adding print(text) at the top of isEnglish() ... a) you're checking after every character which is making the code slow, b) you'll notice your cipher text is flawed and shift 19 is "hello theue old chvm" – Mike Guelfi Jan 29 '20 at 04:54
  • @MikeGuelfi How do I make it pass individual words instead of letters? – ZeZekeZ Jan 29 '20 at 05:06
  • Posted as an answer simply for formatting reasons... – Mike Guelfi Jan 29 '20 at 05:12
  • Variable and function names should follow the `lower_case_with_underscores` style. – AMC Jan 29 '20 at 05:19
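
The decoding mentioned in the comments above can be checked with a short sketch. The helper name `shift_back` is hypothetical (it is not in the original post); it applies the same backward shift over the question's alphabet, using a modulo instead of the two wrap-around branches.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789!@#$%/."

def shift_back(text, shift, alphabet=ALPHABET):
    """Shift every character found in `alphabet` back by `shift` positions."""
    out = []
    for ch in text.lower():
        idx = alphabet.find(ch)
        if idx == -1:
            out.append(ch)  # characters outside the alphabet (e.g. spaces) pass through
        else:
            out.append(alphabet[(idx - shift) % len(alphabet)])
    return "".join(out)

print(shift_back('0x447 #0x$x 74w v0%5', 19))  # hello theue old chvm
print(shift_back('zw336 @zw9w 63v uz#4', 18))  # hello there old chum
```

With the first ciphertext, at most two of the four decoded words can match a dictionary, which is below the 60% threshold, so no shift is reported for it.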

2 Answers


This part is indented too far as @tripleee suggested:

testing = isEnglish(ciphertext)
for text in ciphertext:
    if testing == True:
        matches.append(ciphertext)
        return i, matches

Also, you don't need to check the length if you have the indentation right and let the previous loop complete.
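
As a rough sketch of what that looks like in a complete function: the check runs once per shift, after the inner loop has built the whole candidate. To keep it runnable without dictionary.txt, this uses a tiny stand-in word set (`SAMPLE_WORDS`, an assumption for illustration) and the snake_case names `is_english` and `caesar`, which are not the names used in the question.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789!@#$%/."
# Stand-in for the real dictionary so the sketch runs on its own (assumption).
SAMPLE_WORDS = {"hello", "there", "old", "chum"}

def is_english(text, words=SAMPLE_WORDS, threshold=60.0):
    """Return True if at least `threshold` percent of the words are known."""
    tokens = text.lower().split()
    if not tokens:
        return False
    hits = sum(1 for t in tokens if t in words)
    return 100.0 * hits / len(tokens) >= threshold

def caesar(text, alphabet=ALPHABET):
    """Try every shift; return (shift, [plaintext]) for the first English-looking one."""
    lower_text = text.lower()
    for shift in range(len(alphabet)):
        candidate = ""
        for c in lower_text:
            if c in alphabet:
                candidate += alphabet[(alphabet.find(c) - shift) % len(alphabet)]
            else:
                candidate += c
        # Checked once per shift, after the full candidate has been built.
        if is_english(candidate):
            return shift, [candidate]
    return None  # no shift reached the threshold

print(caesar('zw336 @zw9w 63v uz#4'))  # (18, ['hello there old chum'])
print(caesar('0x447 #0x$x 74w v0%5'))  # None
```

The second call returns None because that ciphertext decodes to "hello theue old chvm" at shift 19, where only half of the words match.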

Mike Guelfi

I found out that dictionary.txt does not contain two- or three-letter words, so long inputs with many of these words scored too low and returned False. I added a list of common words, so now all inputs work accurately.
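
A minimal sketch of that idea, under stated assumptions: the list of common words actually added is not shown in this post, so `COMMON_SHORT_WORDS` below is a hypothetical supplement, and `long_only` is a made-up stand-in for a dictionary that has no short words.

```python
# Hypothetical supplement; the real list of added words is not given in the post.
COMMON_SHORT_WORDS = {"a", "i", "an", "is", "it", "of", "to", "the", "and", "old"}

def build_word_set(dictionary_words, extra_words=COMMON_SHORT_WORDS):
    """Combine the dictionary with extra short words into one lowercase set."""
    return {w.lower() for w in dictionary_words} | set(extra_words)

def english_percent(text, words):
    """Percentage of whitespace-separated words in `text` found in `words`."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in words)
    return 100.0 * hits / len(tokens)

long_only = ["hello", "there", "chum"]  # stand-in dictionary without short words
words = build_word_set(long_only)

print(english_percent("it is an old chum", set(long_only)))  # 20.0
print(english_percent("it is an old chum", words))           # 100.0
```

Without the supplement, a perfectly valid phrase scores 20% and would be rejected by the 60% threshold.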

If anyone wants to help me make this code more efficient, I'd love some pointers. I am very new to Python.
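
For what it's worth, here is one possible direction, as a sketch rather than a definitive rewrite. It keeps the words in a set (membership tests on a set are much faster than on a list) and scores every shift, then returns the one with the highest percentage instead of the first one over a threshold. The names `decode`, `english_percent`, `best_shift`, and the tiny `SAMPLE_WORDS` set are assumptions for illustration; the real word set would come from dictionary.txt.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789!@#$%/."
# Stand-in for the real dictionary so the sketch runs on its own (assumption).
SAMPLE_WORDS = frozenset({"hello", "there", "old", "chum"})

def decode(text, shift, alphabet=ALPHABET):
    """Shift each alphabet character back by `shift`; leave other characters alone."""
    return "".join(
        alphabet[(alphabet.index(c) - shift) % len(alphabet)] if c in alphabet else c
        for c in text.lower()
    )

def english_percent(text, words=SAMPLE_WORDS):
    """Percentage of whitespace-separated words in `text` found in `words`."""
    tokens = text.split()
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in words)
    return 100.0 * hits / len(tokens)

def best_shift(text, words=SAMPLE_WORDS, alphabet=ALPHABET):
    """Return (shift, plaintext, percent) for the highest-scoring shift."""
    scored = []
    for shift in range(len(alphabet)):
        plain = decode(text, shift, alphabet)
        scored.append((shift, plain, english_percent(plain, words)))
    # max() keeps the first entry on ties, i.e. the smallest shift.
    return max(scored, key=lambda item: item[2])

print(best_shift('zw336 @zw9w 63v uz#4'))  # (18, 'hello there old chum', 100.0)
print(best_shift('0x447 #0x$x 74w v0%5'))  # (19, 'hello theue old chvm', 50.0)
```

Returning the best score also surfaces near misses, such as the second input, where only half of the decoded words are real.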

ZeZekeZ