Python Spell Checker Using Linear Search

Question

I am trying to write a spell checker using a linear search that takes Shakespeare's full works and compares them to a 10,000-word dictionary. I want the code to output every word in Shakespeare's full works that isn't in the dictionary. I have attached pictures of my current output and of the output I am looking for. The code I currently have doesn't produce any errors; however, as the current output shows, it displays all the words in Shakespeare's full works. Any help here is appreciated.

https://i.stack.imgur.com/Oc7BQ.jpg: Current Output

https://i.stack.imgur.com/Z1tsE.jpg: Output I'm looking for

import re
import time
start_time = time.time()

def LinearSearch(Target, Words):
#Linear search for target in words. Words need not be sorted.
    for s in Words:
        if s==Target:
            return True
        return False

# Gets the Dictionary.
Words = [s.strip("\n").lower() for s in open("10kWords.txt")]

# Gets ShakespearesFullWorks and Encodes it.
Input_File = open('ShakespeareFullWorks.txt', "r", encoding='utf-8')
lines = Input_File.readlines()
for x in lines:
    if not LinearSearch(x, Words):
        print (re.findall(r"[\w']+", x))

print ("--- %s seconds ---" % (time.time() - start_time))

Arndt, the output is the entirety of Shakespeare's full works. It is far too large to post in full in this question, which is why I have added a small photo showing the output. – Joe Bloggs, Apr 14 '18 at 11:19
Whatever it is you're showing in the photo, paste it as text instead. – Arndt Jonasson, Apr 14 '18 at 12:31

score 1 · Accepted Answer · answered Apr 14 '18 at 11:31


The problem is that x in LinearSearch(x, Words) is not a word but rather a line. So every line is printed because a line will likely not match a word. You need to do:

for line in lines:
    for word in re.findall(r"[\w']+", line):
        if not LinearSearch(word, Words):
            print(word)

That is assuming that re.findall(r"[\w']+", x) returns a list of the words in x.
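For reference, here is a minimal end-to-end sketch of the per-word check. It rests on two assumptions that the answer above does not state. First, in the question's code as posted, `return False` sits inside the `for` loop, so the search gives up after comparing only the first dictionary entry; if that indentation is real and not a paste artifact, it has to move after the loop. Second, the dictionary is lowercased when loaded, so each word is lowercased before the lookup. The names `linear_search`, `misspelled`, and the sample data are hypothetical stand-ins for the real files.

```python
import re

def linear_search(target, words):
    """Return True if target is in words, scanning front to back."""
    for s in words:
        if s == target:
            return True
    # Only report a miss after every entry has been compared.
    return False

def misspelled(lines, words):
    """Collect words from lines that are absent from the dictionary list."""
    missing = []
    for line in lines:
        for word in re.findall(r"[\w']+", line):
            # The dictionary was lowercased on load, so lowercase here too.
            if not linear_search(word.lower(), words):
                missing.append(word)
    return missing

# Tiny in-memory stand-ins for 10kWords.txt and ShakespeareFullWorks.txt.
sample_words = ["to", "be", "or", "not", "that", "is", "the", "question"]
sample_lines = ["To be, or not to be,", "that is the quaestion"]
print(misspelled(sample_lines, sample_words))  # ['quaestion']
```

With the real files, `sample_words` and `sample_lines` would be replaced by the lists read from disk as in the question.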

answered Apr 14 '18 at 11:31

Dan D.


Thank you very much, Dan! This is exactly what I was looking for. I appreciate the explanation as well; it makes me realise where the mistakes were. Thanks again :)! – Joe Bloggs Apr 14 '18 at 11:36
