I get an `IndexError: index -1 is out of bounds for axis 0 with size 0` from scipy when trying to implement a text generator with n-grams.

Traceback (most recent call last):
  File "C:\Users\hp\PycharmProjects\N-gram poems\trigram_model.py", line 125, in <module>
    generate()
  File "C:\Users\hp\PycharmProjects\N-gram poems\trigram_model.py", line 118, in generate
    singleverse(int(c))
  File "C:\Users\hp\PycharmProjects\N-gram poems\trigram_model.py", line 80, in singleverse
    result = stats.multinomial.rvs(1, word_probabilities)
  File "C:\Users\hp\PycharmProjects\N-gram poems\venv\lib\site-packages\scipy\stats\_multivariate.py", line 3242, in rvs
    n, p, npcond = self._process_parameters(n, p)
  File "C:\Users\hp\PycharmProjects\N-gram poems\venv\lib\site-packages\scipy\stats\_multivariate.py", line 3036, in _process_parameters
    p[..., -1] = 1. - p[..., :-1].sum(axis=-1)
IndexError: index -1 is out of bounds for axis 0 with size 0

The call is inside a for loop, and the point at which the error occurs changes on every run. Sometimes it does not occur at all. It mostly occurs close to the end of the program.

This is the code where the error occurs:

def singleverse(num):
    TrTrigrams = [((filtered_tokens[i], filtered_tokens[i + 1]), filtered_tokens[i + 2]) for i in
                  range(len(filtered_tokens) - 2)]
    TrTrigramCFD = nltk.ConditionalFreqDist(TrTrigrams)
    TrTrigramPbs = nltk.ConditionalProbDist(TrTrigramCFD, nltk.MLEProbDist)

    rand = random.choice(random_choice_list)
    start_word = ('<s>', rand)
    data = []
    for i in range(10):
        probable_words = list(TrTrigramPbs[start_word].samples())
        word_probabilities = [TrTrigramPbs[start_word].prob(word) for word in probable_words]
        result = stats.multinomial.rvs(1, word_probabilities)
        index_of_probable_word = list(result).index(1)
        start_word = (start_word[1], (probable_words[index_of_probable_word]))
        data.append(start_word[1])
    line = []
    for i in data:
        if i != "<s>" and i != "</s>":
            line.append(i)
    poem_line = ' '.join([str(i) for i in line]).capitalize()
    print(poem_line)


def generate():
    """Generates the final poem with user input of structure."""
    print("What structure do you want?(e.g., 3 x 4, 2 x 4, 2 x 5): ")
    while True:
        try:
            x, y, z = input().split()
        except:
            print("Enter the structure as shown above.")
            continue
        break

    while True:
        try:
            for stanza in range(1):
                for first_verse in range(1):
                    b = random.randint(7, 12)
                    firstverse(int(b))
                for verse in range(int(z) - 1):
                    a = random.randint(7, 12)
                    singleverse(int(a))
                print('\n')
            for stanza in range(int(x) - 1):
                for verse in range(int(z)):
                    c = random.randint(7, 12)
                    singleverse(int(c))
                print('\n')
        except KeyError:
            print("This was not a valid seed word please try again.")
            continue
        break

generate()
Luci
  • It doesn't appear that the traceback refers to *any* of the lines of code that you posted. For example, in the traceback, line 125 contains `generate()`, but `generate()` isn't in the code snippet that you posted. – jjramsey Jun 14 '21 at 13:11
  • I included some more code. – Luci Jun 14 '21 at 13:19
  • I can reproduce your error message by running `scipy.stats.multinomial.rvs(1,[])`. This would indicate that the list `word_probabilities` on line 80 is empty. If you can figure out why it's empty, then you'll probably have solved your problem. – jjramsey Jun 14 '21 at 14:28
  • I think it is because of the `random.choice(random_choice_list)`. When I replace it with a fixed word, the code runs fine. Do you have any ideas on how I can write code that randomly chooses the word after `'<s>'` in `start_word`? – Luci Jun 14 '21 at 14:36
  • I doubt the problem is with `random.choice(random_choice_list)`. Is `start_word` supposed to be a tuple or a string? – jjramsey Jun 14 '21 at 15:28
  • It is supposed to be a tuple. – Luci Jun 14 '21 at 15:46
  • What do you get if you print `start_word` and `TrTrigramCFD.conditions()`? Is `start_word` in the list of conditions? – jjramsey Jun 14 '21 at 17:27
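
The check suggested in the comments can be sketched without NLTK or scipy. This is only an illustration under assumptions, not the asker's code: `build_trigram_counts`, `sample_next`, `generate_line`, and the toy corpus are hypothetical names and data, and the standard library's `random.choices` stands in for `stats.multinomial.rvs`. The point it shows is that a two-word context which never appears as a condition has no followers, so the probability list would be empty. One context that can end up like this is the last two tokens of the corpus, which may be related to the error showing up only on some runs.

```python
import random
from collections import Counter, defaultdict


def build_trigram_counts(tokens):
    """Map each (w1, w2) context to a Counter of the words seen after it."""
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        counts[(w1, w2)][w3] += 1
    return counts


def sample_next(counts, context, rng=random):
    """Draw a follower in proportion to its count, or None for an unseen context."""
    followers = counts.get(context)  # .get() does not create an empty entry
    if not followers:
        return None  # this is the case that yields an empty probability list
    words = list(followers)
    weights = [followers[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate_line(counts, start, max_words=10, rng=random):
    """Extend `start` word by word, stopping early at an unseen context."""
    context = start
    words = []
    for _ in range(max_words):
        nxt = sample_next(counts, context, rng)
        if nxt is None:
            break
        words.append(nxt)
        context = (context[1], nxt)
    # Drop the sentence markers, as the original singleverse() does.
    return [w for w in words if w not in ("<s>", "</s>")]


# Toy corpus (hypothetical); the final pair ("ran", "</s>") has no follower.
tokens = "<s> the cat sat </s> <s> the dog ran </s>".split()
counts = build_trigram_counts(tokens)
print(generate_line(counts, ("the", "dog")))
```

If the seed word is picked at random, one option is to choose it only from contexts that actually exist, for example `random.choice([c for c in counts if c[0] == "<s>"])`, so that the first lookup cannot be empty.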