
I roughly followed this tutorial:

https://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/

A notable difference is that I use 2 LSTM layers with dropout. My dataset is also different: it is a collection of songs in ABC notation. I do get some songs generated, but after a certain number of steps in the generation process (anywhere from 30 steps to a couple of hundred), the LSTM keeps generating the exact same sequence over and over again. For example, it once got stuck generating URLs for songs:

F: http://www.youtube.com/watch?v=JPtqU6pipQI

and so on ...

It also once got stuck generating the same two songs (the two songs form a sequence of about 300 characters). In the beginning it generated 3-4 good pieces, but afterwards it kept regenerating those two songs almost indefinitely.

Does anyone have any insight into what could be happening?

I want to clarify that any generated sequence, whether repeating or not, appears to be new (the model is not memorising the training data). The validation loss and training loss decrease as expected. Andrej Karpathy was able to generate documents thousands of characters long, and I couldn't find this pattern of getting stuck indefinitely in his results:

http://karpathy.github.io/2015/05/21/rnn-effectiveness/
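
For reference, a minimal sketch of the kind of setup described above (one-hot character windows feeding two LSTM layers with dropout and a softmax over the character vocabulary); the layer sizes, dropout rate, and variable names below are placeholders for illustration, not necessarily the values I used:

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

maxlen = 100    # length of each input window (placeholder value)
n_chars = 60    # size of the character vocabulary (placeholder value)

model = Sequential()
model.add(LSTM(256, input_shape=(maxlen, n_chars), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(256))
model.add(Dropout(0.2))
model.add(Dense(n_chars, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')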

  • Try using a `stateful` mode in order to connect consecutive generations. – Marcin Możejko Nov 05 '17 at 21:24
  • hmm I was trying to avoid that, but I'll try it. Thanks for the suggestion :) – oneThousandHertz Nov 05 '17 at 22:42
  • I found that increasing the length of the samples (sub-batches) that make up my long sequences made a big difference, without having to use *stateful*. A simple fix that might be worth trying (a sketch of this kind of change is shown after the comments). – Phil Feb 08 '18 at 11:41
  • @Phil could you say more about your work and this observation? What was the application domain, and how long were your input sequences before and then afterwards? Was your model generating the same outputs before you increased the "look back" size of your inputs? I'm curious because I'm facing the same problem now. @MarcinMożejko could you say more about why setting `stateful` to True helps prevent the model from memorizing the inputs and cycling back through seen values? – duhaime Oct 25 '18 at 01:35
  • @duhaime I haven't touched the project in a while, but here is what I can tell you from memory. The domain was generation of up to 16 parameters at each step of a time series (to feed a vocoder for speech generation). That said, I think I did eventually manage to debug stateful use, but it didn't really help me. The project description is here if you are interested: http://babble-rnn.consected.com/ – Phil Oct 29 '18 at 11:43
  • Thanks @Phil. I can say that I've since found that increasing the sample length also helped my model break out of repeating the same outputs considerably – duhaime Oct 29 '18 at 17:34
  • Hey, what does sample length mean? I have a sequence length of 100; should I change anything? I am facing the same issue and getting the same kind of repetitive content. Any advice on what I can do? – JustABeginner Jun 30 '19 at 08:14
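
To illustrate the "sample length" suggestion from the comments above: the window length is fixed when the long text is cut into training samples, so increasing it is a data-preparation change rather than a model change. A minimal sketch, assuming a character-level setup where text, chars and char_indices are the usual corpus, vocabulary and character-to-index mapping (hypothetical names, matching the second answer below):

import numpy as np

maxlen = 300   # longer window than e.g. 100 (assumed values)
step = 3       # stride between consecutive windows

sentences, next_chars = [], []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])

# One-hot encode: x has shape (n_samples, maxlen, n_chars), y has shape (n_samples, n_chars)
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1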

2 Answers


Instead of taking the argmax on the prediction output, try introducing some randomness by sampling from the predicted distribution. Change this:

np.argmax(prediction_output)

to this:

np.random.choice(len(prediction_output), p=prediction_output)

I struggled with this repeating-sequences issue for a while, until I found this Colab notebook and figured out why its model was able to generate some really good samples: https://colab.research.google.com/github/tensorflow/tpu/blob/master/tools/colab/shakespeare_with_tpu_and_keras.ipynb#scrollTo=tU7M-EGGxR3E

After I changed this single line, my model went from generating a few words over and over to something actually interesting!
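
As a rough sketch of where this change sits in a character-level generation loop (model, maxlen, char_indices and indices_char are assumed names from a typical setup, not taken from the question):

import numpy as np

def generate_next_char(model, seed_text, char_indices, indices_char, maxlen):
    # One-hot encode the last maxlen characters of the seed
    x = np.zeros((1, maxlen, len(char_indices)))
    for t, char in enumerate(seed_text[-maxlen:]):
        x[0, t, char_indices[char]] = 1.

    # Predicted distribution over the next character
    preds = model.predict(x, verbose=0)[0].astype('float64')
    preds /= preds.sum()  # renormalise so the probabilities sum to 1 exactly

    # Sample from the distribution instead of always taking the argmax
    next_index = np.random.choice(len(preds), p=preds)
    return indices_char[next_index]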

Shane Smiskol

To train a text-generation model and sample from it, follow these steps:

  1. Draw from the model a probability distribution over the next character, given the text available so far (these are the prediction scores).
  2. Reweight the distribution to a certain "temperature" (see the code below).
  3. Sample the next character at random according to the reweighted distribution (see the code below).
  4. Append the new character to the end of the available text.

See the sample function:

import numpy as np

def sample(preds, temperature=1.0):
    # Reweight the predicted distribution by the temperature and renormalise
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    # Draw one character index at random from the reweighted distribution
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

You can then call the sample function during training to generate some text after each epoch, as follows:

import random
import sys
import numpy as np

for epoch in range(1, 60):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y,
              batch_size=128,
              epochs=1)

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # Generate 400 characters
        for i in range(400):
            # One-hot encode the current window of generated text
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            # Slide the window forward by one character
            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

A low temperature results in extremely repetitive and predictable text, but its local structure is highly realistic: in particular, all words (a word being a local pattern of characters) are real English words. With higher temperatures, the generated text becomes more interesting, surprising, and even creative.
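
As a quick illustration of what the temperature does, here is the reweighting step from the sample function applied to a made-up three-character distribution:

import numpy as np

preds = np.array([0.7, 0.2, 0.1])   # toy prediction scores

for temperature in [0.2, 1.0, 1.2]:
    reweighted = np.exp(np.log(preds) / temperature)
    reweighted /= reweighted.sum()
    print(temperature, np.round(reweighted, 3))

# 0.2 sharpens the distribution towards the most likely character (~[0.998, 0.002, 0.000]),
# 1.0 leaves it unchanged, and 1.2 flattens it slightly (~[0.645, 0.227, 0.128]).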

See this notebook

Guillem