Questions tagged [lstm]

Long short-term memory (LSTM): a neural network (NN) architecture that contains recurrent NN blocks which can remember a value for an arbitrary length of time. A very popular building block for deep NNs.

Long short-term memory networks (LSTMs) are a type of recurrent neural network. They can take time-series data and make predictions using knowledge of how the system is evolving.

A major benefit of LSTMs is their ability to store and use long-term information, not just what they are given at a particular instant. For more information on LSTMs, see colah's blog and MachineLearningMastery.

6289 questions
22 votes, 1 answer

Using Dropout with Keras and LSTM/GRU cell

In Keras you can specify a dropout layer like this: model.add(Dropout(0.5)) But with a GRU cell you can specify the dropout as a parameter in the constructor: model.add(GRU(units=512, return_sequences=True, dropout=0.5, …
BigBadMe • 1,754 • 1 • 19 • 27
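A minimal sketch of the two placements the question contrasts (the feature size is an assumption): `dropout=` inside a recurrent layer masks that layer's inputs, `recurrent_dropout=` masks its state, and a standalone Dropout layer masks the previous layer's outputs.

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import GRU, Dropout, Dense

    model = Sequential([
        # dropout= masks this layer's inputs at each time step;
        # recurrent_dropout= would mask the recurrent state between steps.
        GRU(512, return_sequences=True, dropout=0.5, input_shape=(None, 16)),
        # A standalone Dropout layer masks the outputs of the layer above.
        Dropout(0.5),
        GRU(512),
        Dense(1),
    ])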
22 votes, 4 answers

Use LSTM tutorial code to predict next word in a sentence?

I've been trying to understand the sample code at https://www.tensorflow.org/tutorials/recurrent, which you can find at https://github.com/tensorflow/models/blob/master/tutorials/rnn/ptb/ptb_word_lm.py (using TensorFlow 1.3.0). I've summarized…
Darren Cook • 27,837 • 13 • 117 • 217
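A hedged sketch of the decoding step the question is after (the names `model`, `word_to_id`, and `id_to_word` are hypothetical stand-ins for a trained language model and its vocabulary lookups):

    import numpy as np

    def predict_next_word(model, word_to_id, id_to_word, sentence):
        # Map words to ids, run the model, and take the most likely token
        # from the distribution predicted after the final word.
        ids = np.array([[word_to_id[w] for w in sentence.split()]])
        scores = model.predict(ids)              # (1, timesteps, vocab_size)
        next_id = int(np.argmax(scores[0, -1]))  # last time step's scores
        return id_to_word[next_id]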
22 votes, 5 answers

Can someone explain to me the difference between the activation and recurrent activation arguments passed when initialising a Keras LSTM layer?

Can someone explain to me the difference between the activation and recurrent activation arguments passed when initialising a Keras LSTM layer? According to my understanding, an LSTM has 4 layers. Please explain what are the default activation functions of…
Mayank Uniyal • 221 • 1 • 2 • 4
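In short: `activation` transforms the candidate cell state and the cell output, while `recurrent_activation` is applied to the input, forget, and output gates. A minimal sketch with the tf.keras defaults spelled out (older standalone Keras defaulted to 'hard_sigmoid' for the gates):

    from tensorflow.keras.layers import LSTM

    # activation           -> candidate cell state and output transform (tanh)
    # recurrent_activation -> input, forget, and output gates (sigmoid)
    layer = LSTM(64, activation='tanh', recurrent_activation='sigmoid')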
21 votes, 2 answers

Understanding input shape to PyTorch LSTM

This seems to be one of the most common questions about LSTMs in PyTorch, but I am still unable to figure out what the input shape to a PyTorch LSTM should be. Even after following several posts (1, 2, 3) and trying out the solutions, it doesn't seem…
PinkBanter • 1,686 • 5 • 17 • 38
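For reference, a minimal sketch of the expected shapes (sizes are arbitrary): by default nn.LSTM takes input of shape (seq_len, batch, input_size); with batch_first=True it takes (batch, seq_len, input_size).

    import torch
    import torch.nn as nn

    lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, batch_first=True)

    x = torch.randn(32, 5, 10)       # (batch=32, seq_len=5, features=10)
    out, (h_n, c_n) = lstm(x)
    print(out.shape)   # torch.Size([32, 5, 20])  output at every time step
    print(h_n.shape)   # torch.Size([2, 32, 20])  final hidden state per layer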
21 votes, 4 answers

How exactly does LSTMCell from TensorFlow operate?

I'm trying to reproduce the results generated by TensorFlow's LSTMCell to be sure that I know what it does. Here is my TensorFlow code: num_units = 3 lstm = tf.nn.rnn_cell.LSTMCell(num_units=num_units) timesteps = 7 num_input = 4 X =…
Roman • 124,451 • 167 • 349 • 456
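One way to check such a reproduction is against the standard LSTM equations; a minimal NumPy sketch (the gate ordering inside the stacked weight matrices is an assumption and differs between implementations):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x, h, c, W, U, b):
        # W: (input_dim, 4*units), U: (units, 4*units), b: (4*units,)
        z = x @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=-1)  # gate order: an assumption
        c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_new = sigmoid(o) * np.tanh(c_new)
        return h_new, c_new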
21 votes, 3 answers

CuDNNLSTM: Failed to call ThenRnnForward

I am facing an issue when trying to use CuDNNLSTM instead of keras.layers.LSTM. This is the error I am getting: Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size,…
user3084192 • 357 • 1 • 3 • 7
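This error is often (though not always) a symptom of the GPU running out of memory; one common mitigation, not a guaranteed fix, is to stop TensorFlow from pre-allocating the whole GPU (TF1-style session config, matching the CuDNNLSTM era):

    import tensorflow as tf

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True   # allocate GPU memory on demand
    tf.keras.backend.set_session(tf.Session(config=config))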
21 votes, 1 answer

Generating MNIST numbers using LSTM-CGAN in TensorFlow

Inspired by this article, I'm trying to build a Conditional GAN which will use an LSTM to generate MNIST numbers. I hope I'm using the same architecture as in the image below (except for the bidirectional RNN in the discriminator, taken from this…
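A hedged sketch of the generator half of such a model (all sizes are assumptions, not taken from the article): condition an LSTM on the digit label and emit the image one 28-pixel row per time step.

    from tensorflow.keras import layers, Model

    noise = layers.Input(shape=(100,))
    label = layers.Input(shape=(1,), dtype='int32')
    cond = layers.Flatten()(layers.Embedding(10, 50)(label))  # label -> vector
    h = layers.Concatenate()([noise, cond])
    h = layers.RepeatVector(28)(h)                  # one copy per image row
    h = layers.LSTM(128, return_sequences=True)(h)
    rows = layers.TimeDistributed(layers.Dense(28, activation='tanh'))(h)
    generator = Model([noise, label], rows)         # output: (batch, 28, 28)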
21 votes, 1 answer

LSTM time sequence generation using PyTorch

For several days now, I have been trying to build a simple sine-wave sequence generator using an LSTM, without any success so far. I started from the time sequence prediction example. All I wanted to do differently is: Use different…
OSM • 419 • 3 • 10
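A minimal working sketch of the task (hyperparameters are arbitrary, not taken from the question): train an LSTM to predict the next sample of a sine wave.

    import math
    import torch
    import torch.nn as nn

    wave = torch.sin(torch.linspace(0, 8 * math.pi, 1000))
    x = wave[:-1].view(1, -1, 1)   # (batch=1, seq_len, features=1)
    y = wave[1:].view(1, -1, 1)    # target: the sequence shifted one step

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.lstm = nn.LSTM(1, 32, batch_first=True)
            self.fc = nn.Linear(32, 1)

        def forward(self, x):
            out, _ = self.lstm(x)      # predictions at every time step
            return self.fc(out)

    net = Net()
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), y)
        loss.backward()
        opt.step()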
21 votes, 2 answers

How to interpret weights in an LSTM layer in Keras

I'm currently training a recurrent neural network for weather forecasting, using a LSTM layer. The network itself is pretty simple and looks roughly like this: model = Sequential() model.add(LSTM(hidden_neurons, input_shape=(time_steps,…
Isa • 1,121 • 3 • 10 • 17
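For a Keras LSTM, get_weights() returns three arrays whose 4-block column layout corresponds to the input, forget, cell, and output gates, in that order:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM

    model = Sequential([LSTM(16, input_shape=(10, 4))])
    W, U, b = model.layers[0].get_weights()
    print(W.shape)  # (4, 64)   kernel: (input_dim, 4*units)
    print(U.shape)  # (16, 64)  recurrent kernel: (units, 4*units)
    print(b.shape)  # (64,)     bias: (4*units,)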
21 votes, 2 answers

Keras: How should I prepare input data for an RNN?

I'm having trouble preparing input data for an RNN in Keras. Currently, my training data dimension is: (6752, 600, 13) 6752: number of training samples 600: number of time steps 13: size of the feature vector (floats) X_train and…
totuta • 383 • 1 • 3 • 8
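Keras recurrent layers expect 3-D input of shape (samples, timesteps, features), so data shaped (6752, 600, 13) is already in the right layout; only the per-sample shape is passed to the layer. A minimal sketch with placeholder data:

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    X_train = np.random.rand(6752, 600, 13).astype('float32')  # placeholder
    y_train = np.random.rand(6752, 1).astype('float32')        # placeholder

    model = Sequential([
        LSTM(64, input_shape=(600, 13)),  # (timesteps, features); samples implicit
        Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')
    model.fit(X_train, y_train, batch_size=32, epochs=2)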
21 votes, 3 answers

How to train an RNN with LSTM cells for time series prediction

I'm currently trying to build a simple model for predicting time series. The goal is to train the model with a sequence so that the model is able to predict future values. I'm using TensorFlow and LSTM cells to do so. The model is trained with…
Jakob • 369 • 1 • 3 • 11
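The usual preparation step for this kind of model is to slice the series into overlapping (input window, next value) pairs; a minimal sketch:

    import numpy as np

    def make_windows(series, window):
        # Each sample is `window` consecutive values; the target is the next one.
        X = [series[i:i + window] for i in range(len(series) - window)]
        y = [series[i + window] for i in range(len(series) - window)]
        return np.array(X)[..., None], np.array(y)  # X: (samples, window, 1)

    series = np.sin(np.linspace(0, 100, 1000))
    X, y = make_windows(series, window=50)
    print(X.shape, y.shape)  # (950, 50, 1) (950,)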
21 votes, 4 answers

How to implement a deep bidirectional LSTM with Keras?

I am trying to implement an LSTM-based speech recognizer. So far I have been able to set up a bidirectional LSTM (I think it is working as a bidirectional LSTM) by following the example in the Merge layer. Now I want to try it with another bidirectional LSTM layer,…
udani • 1,243 • 2 • 11 • 33
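With the current Keras API the Merge-layer approach is replaced by the Bidirectional wrapper, and stacking works as long as every layer but the last returns full sequences. A sketch (layer and output sizes are assumptions):

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Bidirectional, LSTM, Dense

    model = Sequential([
        Bidirectional(LSTM(128, return_sequences=True), input_shape=(None, 40)),
        Bidirectional(LSTM(128, return_sequences=True)),
        Bidirectional(LSTM(128)),            # last layer: sequence -> vector
        Dense(29, activation='softmax'),     # e.g. character classes
    ])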
20 votes, 4 answers

Stateful LSTM and stream predictions

I've trained an LSTM model (built with Keras and TF) on multiple batches of 7 samples with 3 features each, with a shape like the sample below (the numbers are just placeholders for the purpose of explanation); each batch is labeled 0 or…
Shlomi Schwartz • 8,693 • 29 • 109 • 186
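A minimal sketch of the streaming setup (sizes follow the question's 3 features; the rest are assumptions): with stateful=True the LSTM state persists across predict calls, so a stream can be fed one time step at a time.

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    model = Sequential([
        # batch_input_shape fixes (batch, timesteps, features) up front.
        LSTM(32, stateful=True, batch_input_shape=(1, 1, 3)),
        Dense(1, activation='sigmoid'),
    ])
    # for step in stream:                   # one step per call, state kept
    #     p = model.predict(step.reshape(1, 1, 3))
    # model.reset_states()                  # between independent sequences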
20 votes, 1 answer

Keras LSTM input dimension setting

I was trying to train an LSTM model using Keras, but I think I got something wrong here. I got an error of ValueError: Error when checking input: expected lstm_17_input to have 3 dimensions, but got array with shape (10000, 0, 20), while my code…
Mr.cysl • 1,494 • 6 • 23 • 37
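A shape like (10000, 0, 20) has an empty timestep axis, so the array itself is malformed rather than merely mis-declared; checking and rebuilding that axis usually resolves it. A sketch with placeholder data:

    import numpy as np

    X = np.random.rand(10000, 20)   # placeholder 2-D data
    X = X[:, None, :]               # add a timestep axis -> (10000, 1, 20)
    print(X.shape)                  # no axis may be 0; LSTM input_shape=(1, 20)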
20 votes, 2 answers

How to implement Tensorflow batch normalization in LSTM

My current LSTM network looks like this. rnn_cell = tf.contrib.rnn.BasicRNNCell(num_units=CELL_SIZE) init_s = rnn_cell.zero_state(batch_size=1, dtype=tf.float32) # very first hidden state outputs, final_s = tf.nn.dynamic_rnn( rnn_cell, …
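Batch normalization interacts awkwardly with recurrence, so a common substitute, different from the tf.contrib graph code in the question, is layer normalization between recurrent layers (tf.keras API):

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(64, return_sequences=True, input_shape=(None, 8)),
        tf.keras.layers.LayerNormalization(),  # normalizes each time step
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(1),
    ])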