I am trying to write a spellchecker using a linear search that compares Shakespeare's full works against a 10,000-word dictionary. I want the code to output every word in Shakespeare's full works that isn't in the dictionary. I have attached pictures of my current output and of the output I am looking for. The code I currently have doesn't raise any errors; however, as the current output shows, it prints every word in Shakespeare's full works instead of only the missing ones. Any help is appreciated.
https://i.stack.imgur.com/Oc7BQ.jpg: Current Output
https://i.stack.imgur.com/Z1tsE.jpg: Output I'm looking for
import re
import time

start_time = time.time()

def LinearSearch(Target, Words):
    # Linear search for Target in Words. Words need not be sorted.
    for s in Words:
        if s == Target:
            return True
    return False

# Gets the dictionary.
Words = [s.strip("\n").lower() for s in open("10kWords.txt")]

# Opens ShakespeareFullWorks.txt as UTF-8 and reads it line by line.
Input_File = open('ShakespeareFullWorks.txt', "r", encoding='utf-8')
lines = Input_File.readlines()

for x in lines:
    if not LinearSearch(x, Words):
        print(re.findall(r"[\w']+", x))

print("--- %s seconds ---" % (time.time() - start_time))
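One likely cause, judging only from the code above: the loop passes each whole line (including its trailing newline and original capitalization) to LinearSearch, so a line never equals a single lowercased dictionary word and every line ends up being printed. Below is a minimal sketch of one way around that, which splits each line into words first and then searches for each word separately. The names linear_search and find_unknown_words and the small sample data are hypothetical, chosen only for illustration; lowercasing each word before the lookup is an assumption made to match the lowercased dictionary.

```python
import re

def linear_search(target, words):
    """Return True if target occurs in words (words need not be sorted)."""
    for s in words:
        if s == target:
            return True
    return False

def find_unknown_words(lines, words):
    """Return every word found in lines that is not in words, in order of appearance."""
    unknown = []
    for line in lines:
        # Split the line into words first, then look up each word on its own.
        for token in re.findall(r"[\w']+", line):
            # Assumption: the dictionary is lowercase, so lowercase the word too.
            token = token.lower()
            if not linear_search(token, words):
                unknown.append(token)
    return unknown

# Hypothetical sample data standing in for the dictionary and the text file.
sample_dictionary = ["to", "be", "or", "not", "that", "is", "the", "question"]
sample_lines = ["To be, or not to be,\n", "that is the quaestion.\n"]

result = find_unknown_words(sample_lines, sample_dictionary)
print(result)
```

With the real files, the same function could be called with the Words list and the lines read from ShakespeareFullWorks.txt. A word that appears several times in the text is reported each time; whether duplicates should be collapsed depends on the desired output.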