12

I'm looking for a way to write a Python function that takes a string and returns whether it's spelled correctly. I don't want to check against a dictionary. Instead, I want it to check Google's spelling suggestions, so that celebrity names and other proper nouns count as being spelled correctly.

Here's where I'm at so far. It works most of the time, but it messes up with some celebrity names. For example, things like "cee lo green" or "posner" get marked as incorrect.

import httplib
import xml.dom.minidom

data = """
<spellrequest textalreadyclipped="0" ignoredups="0" ignoredigits="1" ignoreallcaps="1">
<text> %s </text>
</spellrequest>
"""

def spellCheck(word_to_spell):

    con = httplib.HTTPSConnection("www.google.com")
    con.request("POST", "/tbproxy/spell?lang=en", data % word_to_spell)
    response = con.getresponse()

    dom = xml.dom.minidom.parseString(response.read())
    dom_data = dom.getElementsByTagName('spellresult')[0]

    if dom_data.childNodes:
        for child_node in dom_data.childNodes:
            result = child_node.firstChild.data.split()
        for word in result:
            if word_to_spell.upper() == word.upper():
                return True
        return False
    else:
        return True
Sean Gransee
  • Be careful: [5.3 You agree not to access (or attempt to access) any of the Services by any means other than through the interface that is provided by Google, unless you have been specifically allowed to do so in a separate agreement with Google. You specifically agree not to access (or attempt to access) any of the Services through any automated means (including use of scripts or web crawlers) and shall ensure that you comply with the instructions set out in any robots.txt file present on the Services.](http://www.google.com/accounts/TOS) – sarnold Dec 08 '11 at 09:30
  • You don't seem to iterate correctly over `result`. – eumiro Dec 08 '11 at 09:32
  • https://bitbucket.org/mchaput/whoosh/wiki/Home – Surya Sep 02 '12 at 17:05
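On the comment about iterating over `result`: in the question's code, the inner loop runs only after the outer loop has finished, so only the last child node's suggestions are compared. Here is a rough sketch (Python 3) of gathering suggestions from every child node instead. The element name `c` and the sample response are assumptions made for illustration; the only thing taken from the question is that `spellresult` has children whose text is a whitespace-separated list of suggestions.

```python
import xml.dom.minidom

# Hypothetical response, shaped the way the question's parsing code implies.
SAMPLE_RESPONSE = "<spellresult><c>poster posher</c><c>Posner</c></spellresult>"


def collect_suggestions(xml_text):
    """Return the suggestions from every child element of <spellresult>."""
    dom = xml.dom.minidom.parseString(xml_text)
    root = dom.getElementsByTagName("spellresult")[0]
    suggestions = []
    for child in root.childNodes:
        # Skip stray text nodes and elements that carry no text.
        if child.nodeType != child.ELEMENT_NODE or child.firstChild is None:
            continue
        suggestions.extend(child.firstChild.data.split())
    return suggestions


def matches_a_suggestion(word, xml_text):
    """Same rule as the question: no suggestions, or the word is among them."""
    suggestions = collect_suggestions(xml_text)
    if not suggestions:
        return True  # nothing was flagged
    return any(word.upper() == s.upper() for s in suggestions)


print(collect_suggestions(SAMPLE_RESPONSE))
print(matches_a_suggestion("posner", SAMPLE_RESPONSE))
```

Note that a multi-word input such as "cee lo green" can never equal a single suggested word under this rule, which may be part of why phrases get marked as incorrect.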

2 Answers

11

Peter Norvig explains how to implement a spell checker in Python.
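To give a feel for the idea, here is a minimal sketch (Python 3) that loosely follows the structure of his essay: count words in a corpus, generate every string one edit away, and pick the most frequent known candidate. The tiny `CORPUS` string and the `is_spelled_correctly` helper are made up for illustration; the essay trains on a large text file instead.

```python
import re
from collections import Counter

# Hypothetical toy corpus; a real one would be a large body of text.
CORPUS = (
    "the quick brown fox jumps over the lazy dog "
    "the spelling of a word is checked against word counts"
)
WORDS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """The subset of words that appear in the corpus."""
    return {w for w in words if w in WORDS}


def correction(word):
    """Most frequent known candidate, falling back to the word itself."""
    candidates = known([word]) or known(edits1(word)) or [word]
    return max(candidates, key=lambda w: WORDS[w])


def is_spelled_correctly(word):
    """A word counts as correct if the corpus has seen it."""
    return word.lower() in WORDS


print(correction("speling"), correction("teh"), is_spelled_correctly("posner"))
```

With this approach, whether a proper noun counts as correct depends entirely on the corpus: a name is accepted only if the training text contains it.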

duffymo
  • but doesn't that just check against a dictionary in a text file? – Sean Gransee Dec 08 '11 at 09:36
  • Yes I did. It doesn't go out to Google and check if words are correct, it just looks in a text file you specify. – Sean Gransee Dec 08 '11 at 09:51
  • You didn't understand it... did the statistics mean nothing to you? No, it's not going out to Google. I'm suggesting that your way is quite incorrect; this would be a better way to go. – duffymo Dec 08 '11 at 09:57
10

Rather than sticking to Mr. Google, try out other big fellows.

  1. If you really want to stick with search engines that count page requests, Yahoo and Bing provide some excellent features. Yahoo offers spell checking directly through YQL tables (free for non-commercial use, up to 5000 requests/day).

  2. There are a good number of Python APIs that can do much of the same magic, including on the proper nouns you mentioned (they may sometimes get it wrong; after all, the result is based on probability).

So, in the second case, you have a good list to choose from (all totally free):

  1. GNU Aspell (it even has Python bindings)
  2. PyEnchant
  3. Whoosh (It does a lot more than spell checking but I think it has some edge on it.)
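As a rough illustration of the library route, here is a sketch (Python 3) built around PyEnchant's `Dict.check()`. The helper names `is_spelled_correctly`, `make_enchant_checker`, and `SetChecker` are made up for this example, and it assumes PyEnchant and an `en_US` dictionary are installed. `SetChecker` is only a stand-in with the same `check()` method, so the function can be tried without the library.

```python
def is_spelled_correctly(phrase, checker):
    """True if every whitespace-separated word passes checker.check()."""
    return all(checker.check(word) for word in phrase.split())


def make_enchant_checker(extra_words=()):
    """Build a PyEnchant dictionary that also accepts the given proper nouns."""
    import enchant  # third-party package, imported only when this is called

    dictionary = enchant.Dict("en_US")
    for word in extra_words:
        # Accepted for the lifetime of this object only, not saved to disk.
        dictionary.add_to_session(word)
    return dictionary


class SetChecker:
    """Stand-in checker backed by a plain set of lowercase words."""

    def __init__(self, words):
        self.words = {w.lower() for w in words}

    def check(self, word):
        return word.lower() in self.words


demo = SetChecker(["cee", "lo", "green", "hello"])
print(is_spelled_correctly("Cee Lo Green", demo))
print(is_spelled_correctly("helo", demo))
```

With PyEnchant available, `is_spelled_correctly("posner", make_enchant_checker(["posner"]))` would use the real dictionary plus the extra name. The list of names still has to come from somewhere, which is the part Google's suggestions were supplying in the question.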

I hope these give you a clear idea of how things work.

Spell checking actually involves very complex mechanisms drawn from machine learning, AI, NLP, and more, so companies like Google and Yahoo don't really offer their APIs entirely for free.

Surya
  • What does "No one is going to give them for free & open source" mean? You list several free and open source examples in your answer. – Michael Hoffman Sep 04 '12 at 01:44
  • @MichaelHoffman I was actually referring to more sophisticated API's like Yahoo Spell Checking or Google Prediction API.. – Surya Sep 04 '12 at 12:43