Hello, just a quick question, I hope. I am trying to use this program to generate random text from a corpus, in this case a portion of a book.

I have a text file that is my corpus. (This is just the intro; I won't post the whole thing here.)

The Project Gutenberg EBook of My Man Jeeves, by P. G. Wodehouse
#27 in our series by P. G. Wodehouse

Copyright laws are changing all over the world. Be sure to check the
copyright laws for your country before downloading or redistributing
this or any other Project Gutenberg eBook.

This header should be the first thing seen when viewing this Project
Gutenberg file.  Please do not remove it.  Do not change or edit the
header without written permission.

Please read the "legal small print," and other information about the
eBook and Project Gutenberg at the bottom of this file.  Included is
important information about your specific rights and restrictions in
how the file may be used.  You can also find out about how to make a
donation to Project Gutenberg, and how to get involved.

etc etc etc

Next I have the class I am trying to use:

import random

class Markov(object):

    def __init__(self, open_file):
        self.cache = {}
        self.open_file = open_file
        self.words = self.file_to_words()
        self.word_size = len(self.words)
        self.database()


def file_to_words(self):
    self.open_file.seek(0)
    data = self.open_file.read()
    words = data.split()
    return words


def triples(self):
    """ Generates triples from the given data string. So if our string were
            "What a lovely day", we'd generate (What, a, lovely) and then
            (a, lovely, day).
    """

    if len(self.words) < 3:
        return

    for i in range(len(self.words) - 2):
        yield (self.words[i], self.words[i+1], self.words[i+2])

def database(self):
    for w1, w2, w3 in self.triples():
        key = (w1, w2)
        if key in self.cache:
            self.cache[key].append(w3)
        else:
            self.cache[key] = [w3]

def generate_markov_text(self, size=25):
    seed = random.randint(0, self.word_size-3)
    seed_word, next_word = self.words[seed], self.words[seed+1]
    w1, w2 = seed_word, next_word
    gen_words = []
    for i in xrange(size):
        gen_words.append(w1)
        w1, w2 = w2, random.choice(self.cache[(w1, w2)])
    gen_words.append(w2)
    return ' '.join(gen_words)

And finally, the main script, which gives the error: "'Markov' object has no attribute 'file_to_words'"

import Class
file_ = open('derp.txt')
markov = Class.Markov(file_)
markov.generate_markov_text()

What is going wrong here? Thanks.

Are you Shure
    Your file_to_words is not indented, so it is not part of the Markov class. It is a naked function. – Keith Nov 24 '12 at 03:09

2 Answers

You need to indent the file_to_words method so that it is part of the Markov class. As written, it is a module-level function in the Class module. Move everything in file_to_words (including the def line) 4 spaces to the right.

Update: The same goes for all the other methods too. Python uses whitespace/indentation to denote scope.
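
As a rough sketch of what the fix looks like, here is the class from the question with every method indented under `class Markov`. It assumes Python 3 (so `range` replaces the question's `xrange`), the `io.StringIO` corpus is a made-up stand-in for `derp.txt`, and `setdefault` is only a shorter way to write the question's if/else.

```python
import io
import random


class Markov(object):

    def __init__(self, open_file):
        self.cache = {}
        self.open_file = open_file
        self.words = self.file_to_words()
        self.word_size = len(self.words)
        self.database()

    # Every def below is indented one level, so it is a method of Markov.
    def file_to_words(self):
        self.open_file.seek(0)
        return self.open_file.read().split()

    def triples(self):
        # Yield each run of three consecutive words.
        if len(self.words) < 3:
            return
        for i in range(len(self.words) - 2):
            yield (self.words[i], self.words[i + 1], self.words[i + 2])

    def database(self):
        # Map each word pair to the list of words seen right after it.
        for w1, w2, w3 in self.triples():
            self.cache.setdefault((w1, w2), []).append(w3)

    def generate_markov_text(self, size=25):
        seed = random.randint(0, self.word_size - 3)
        w1, w2 = self.words[seed], self.words[seed + 1]
        gen_words = []
        for _ in range(size):
            gen_words.append(w1)
            w1, w2 = w2, random.choice(self.cache[(w1, w2)])
        gen_words.append(w2)
        return ' '.join(gen_words)


# Hypothetical in-memory corpus standing in for derp.txt.
corpus = io.StringIO("a b c a b c a b")
markov = Markov(corpus)
text = markov.generate_markov_text(size=5)
print(text)
```

The toy corpus is chosen so that its final word pair also occurs earlier in the text. On a real file the chain can reach the last pair of words, which has no entry in `cache`, and the lookup inside `random.choice(...)` would then raise a KeyError.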

Whatang
    Last question: can you explain why, now that I have the program running, it won't generate any output? I am basically just taking the program from here and trying to run it for fun, but I am unable to match his output. http://agiliq.com/blog/2009/06/generating-pseudo-random-text-with-markov-chains-u/ – Are you Shure Nov 24 '12 at 03:31
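
One possible reason, judging only from the main script in the question: `generate_markov_text()` returns the string, but nothing prints it. An interactive interpreter echoes a return value automatically, while a script run from a file does not. A minimal sketch, using a hypothetical stand-in function in place of the real method:

```python
def generate_text():
    # Hypothetical stand-in for markov.generate_markov_text().
    return "some generated words"


generate_text()           # as a script: the returned string is discarded
result = generate_text()
print(result)             # an explicit print makes it visible
```

If that is the cause, changing the last line of the main script to print the return value should be enough.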

From the code you posted, none of the methods except __init__ belong to the Markov class, because of the indentation.

Rainfield