As a programming exercise, I'm trying to build a Markov generator that accepts an arbitrary chain length (the number of words in each chain), but I've hit a bug I can't seem to fix. When I run the markov function, I get `IndexError: list index out of range`.

I get the feeling I'm overlooking something obvious, but I'm not sure what. The traceback says the error is on line 41, with words[-1] = nextWords[random.randint(0, len(nextWords)-1)].

Full code is below, sorry if the indenting is messed up.

#! /usr/bin/python

# To change this template, choose Tools | Templates
# and open the template in the editor.

import random

class Markov(object):
    def __init__(self, open_file):
        self.cache = {}
        self.open_file = open_file
        open_file.seek(0)
        self.wordlist = open_file.read().split()

    def get_random_list(self, length):
        i = random.randint(0, len(self.wordlist) - (length - 1))
        result = self.wordlist[i:i + length]
        return result

    def find_next_word(self, words):
        candidates = []
        for i in range(len(self.wordlist) - len(words)):
            if self.wordlist[i:i + len(words)] == words and self.wordlist[i+len(words)+1] not in candidates:
                candidates.append(self.wordlist[i+len(words)+1])
        return candidates

    def markov(self, length=20, chainlength=2):
        gibberish = []
        words = self.get_random_list(chainlength)
        for i in range(len(words)-1):
            gibberish.append(words[i])
        while len(gibberish) < length:
            #find candidate for next word
            nextWords = self.find_next_word(words)
            gibberish.append(words[-1])
            for i in range(len(words)):
                try:
                    words[i] = words[i+1]
                except:
                    pass
            words[-1] = nextWords[random.randint(0, len(nextWords)-1)]
        return " ".join(gibberish)
TVarmy

1 Answer

If words is empty, then yes, that will happen. Accessing `words[-1]` on an empty list is just as invalid as `words[0]`. Add a check for `len(words) == 0`. The same logic holds for `nextWords`, which in this code looks like it could be empty too.
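
A minimal sketch of such a guard, assuming a hypothetical helper named `pick_next` (it is not part of the question's code) that returns `None` when there is nothing to choose from:

```python
import random

def pick_next(candidates):
    """Return a random candidate, or None if the list is empty.

    Hypothetical helper for illustration; not part of the question's code.
    """
    if not candidates:  # an empty list is falsy in Python
        return None
    return random.choice(candidates)

# Indexing an empty list fails for any index, including -1.
try:
    [][-1]
except IndexError as exc:
    print("IndexError:", exc)  # IndexError: list index out of range

print(pick_next([]))        # None
print(pick_next(["word"]))  # word
```

Returning `None` is only one option; the caller could equally stop generating or pick a fresh random starting chain.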

Chris Eberle
  • You can just do `if not words:` rather than `if len(words) == 0:` because an empty container is considered false in Python. Actually, I would do it the other way 'round: `if words: words[-1] = random.choice(nextWords)` – kindall Aug 03 '11 at 17:31
  • Thanks! After much debugging, I found out I was doing some math wrong: I shouldn't have added one in `self.wordlist[i+len(words)+1]`. It was jumping over a word each time. It boils down to noob confusion over counting from zero. The line in question: `if self.wordlist[i:i + len(words)] == words and self.wordlist[i+len(words)+1] not in candidates:` – TVarmy Aug 03 '11 at 20:32
  • Cool. If you like the answer, make sure to mark it with the checkmark on the left. – Chris Eberle Aug 03 '11 at 20:33
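
The off-by-one described in the comments can be sketched as follows. This is a standalone rewrite rather than the asker's final code: `find_next_words` and `markov` are written here as plain functions that take a word list, and stopping early when no candidates remain is an assumption about the desired behavior.

```python
import random

def find_next_words(wordlist, words):
    """Return the distinct words that directly follow `words` in `wordlist`."""
    n = len(words)
    candidates = []
    # i + n is the index of the word right after the matched slice;
    # the range stops early enough that it is always a valid index.
    for i in range(len(wordlist) - n):
        if wordlist[i:i + n] == words and wordlist[i + n] not in candidates:
            candidates.append(wordlist[i + n])
    return candidates

def markov(wordlist, length=20, chainlength=2):
    """Generate up to `length` words from chains of `chainlength` words."""
    if chainlength < 1 or len(wordlist) < chainlength:
        return ""
    # The highest valid start index is len(wordlist) - chainlength.
    start = random.randint(0, len(wordlist) - chainlength)
    words = wordlist[start:start + chainlength]
    gibberish = list(words)
    while len(gibberish) < length:
        candidates = find_next_words(wordlist, words)
        if not candidates:  # the chain reached the end of the text; stop early
            break
        next_word = random.choice(candidates)
        gibberish.append(next_word)
        words = words[1:] + [next_word]  # slide the window forward one word
    return " ".join(gibberish[:length])

print(find_next_words("a b c a b d".split(), ["a", "b"]))  # ['c', 'd']
```

Using `i + n` instead of `i + n + 1` picks the word immediately after the match, and the empty-candidates check avoids indexing into an empty list when the current chain only occurs at the very end of the text.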