Questions tagged [lstm]

Long short-term memory (LSTM). A neural network (NN) architecture built from recurrent NN blocks that can remember a value for an arbitrary length of time. A very popular building block for deep NNs.

Long short-term memory neural networks (LSTMs) are a subset of recurrent neural networks. They can take time-series data and make predictions using knowledge of how the system is evolving.

A major benefit of LSTMs is their ability to store and use long-term information, not just what they are given at a particular instant. For more information on LSTMs, see colah's blog post and MachineLearningMastery.
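For concreteness, here is a minimal sketch of an LSTM applied to time-series prediction (all sizes and data below are illustrative assumptions, not from any one question):

    # Minimal sketch: an LSTM that maps a window of past values to the next value.
    import numpy as np
    from tensorflow import keras

    timesteps, n_features = 28, 1          # e.g. 28 past observations of one series
    model = keras.Sequential([
        keras.layers.LSTM(32, input_shape=(timesteps, n_features)),
        keras.layers.Dense(1),             # predict the next value
    ])
    model.compile(optimizer="adam", loss="mse")

    x = np.random.rand(100, timesteps, n_features)  # (samples, timesteps, features)
    y = np.random.rand(100, 1)
    model.fit(x, y, epochs=2, verbose=0)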

6289 questions
20
votes
4 answers

ValueError: Trying to share variable rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel

This is the code: X = tf.placeholder(tf.float32, [batch_size, seq_len_1, 1], name='X') labels = tf.placeholder(tf.float32, [None, alpha_size], name='labels') rnn_cell = tf.contrib.rnn.BasicLSTMCell(512) m_rnn_cell =…
ParmuTownley
  • 957
  • 2
  • 14
  • 34
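A sketch of the usual fix, assuming TensorFlow 1.x: this error typically appears when the same cell object is reused for every layer of a MultiRNNCell, so all layers try to share one kernel variable. Creating a fresh cell per layer avoids it:

    import tensorflow as tf  # assumes TensorFlow 1.x

    num_layers, num_units = 3, 512
    # One new BasicLSTMCell per layer, not [cell] * num_layers
    cells = [tf.contrib.rnn.BasicLSTMCell(num_units) for _ in range(num_layers)]
    m_rnn_cell = tf.contrib.rnn.MultiRNNCell(cells)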
19
votes
1 answer

How to connect LSTM layers in Keras, RepeatVector or return_sequence=True?

I'm trying to develop an encoder model in Keras for time series. The shape of the data is (5039, 28, 1), meaning that my seq_len is 28 and I have one feature. For the first layer of the encoder, I'm using 112 hidden units; the second layer will have 56 and to be…
Birish
  • 5,514
  • 5
  • 32
  • 51
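A sketch of how the two pieces typically fit together, using the shapes from the question: the encoder's last LSTM returns a single latent vector, RepeatVector copies it once per output timestep, and the decoder LSTMs run with return_sequences=True:

    from tensorflow import keras

    seq_len, n_features = 28, 1
    model = keras.Sequential([
        keras.layers.LSTM(112, return_sequences=True, input_shape=(seq_len, n_features)),
        keras.layers.LSTM(56),                        # encoder output: one latent vector
        keras.layers.RepeatVector(seq_len),           # repeat it for each output step
        keras.layers.LSTM(56, return_sequences=True),
        keras.layers.LSTM(112, return_sequences=True),
        keras.layers.TimeDistributed(keras.layers.Dense(n_features)),
    ])
    model.compile(optimizer="adam", loss="mse")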
19
votes
1 answer

Tensorflow: Attempting to use uninitialized value beta1_power

I got the following error when I try to run the code at the end of the post, but it is not clear to me what is wrong with my code. Could anybody share some tricks for debugging a TensorFlow program? $ ./main.py Extracting…
user1424739
  • 11,937
  • 17
  • 63
  • 152
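A sketch of the usual cause, assuming TensorFlow 1.x: the Adam optimizer creates extra variables (including beta1_power) when its minimize() op is built, so the variable initializer must be constructed and run after the optimizer is defined:

    import tensorflow as tf  # assumes TensorFlow 1.x

    x = tf.Variable(1.0)
    loss = tf.square(x)
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
    init_op = tf.global_variables_initializer()  # built AFTER the optimizer

    with tf.Session() as sess:
        sess.run(init_op)    # initializes beta1_power and friends too
        sess.run(train_op)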
19
votes
3 answers

How to get reproducible result when running Keras with Tensorflow backend

Every time I run an LSTM network with Keras in a Jupyter notebook, I get a different result. I have googled a lot and tried some different solutions, but none of them work. Here are some solutions I tried: set numpy random…
176coding
  • 2,933
  • 4
  • 17
  • 18
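A sketch of the commonly recommended seeding recipe (exact reproducibility on GPU is still not guaranteed, since some cuDNN ops are non-deterministic):

    import os, random
    import numpy as np
    import tensorflow as tf

    os.environ["PYTHONHASHSEED"] = "0"   # ideally set before Python starts
    random.seed(42)
    np.random.seed(42)
    tf.random.set_seed(42)               # tf.set_random_seed(42) on TF 1.x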
19
votes
1 answer

Understanding stateful LSTM

I'm going through this tutorial on RNNs/LSTMs and I'm having quite a hard time understanding stateful LSTMs. My questions are as follows: 1. Training batching size In the Keras docs on RNNs, I found out that the hidden state of the sample in the i-th…
H.M.
  • 261
  • 3
  • 8
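A sketch of the stateful setup, with illustrative shapes: stateful=True carries the final state of batch i over as the initial state of batch i+1, which requires a fixed batch size and manual state resets between epochs:

    from tensorflow import keras

    batch_size, timesteps, n_features = 32, 10, 1
    model = keras.Sequential([
        keras.layers.LSTM(16, stateful=True,
                          batch_input_shape=(batch_size, timesteps, n_features)),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    # Per epoch: model.fit(x, y, batch_size=batch_size, epochs=1, shuffle=False)
    #            model.reset_states()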
18
votes
2 answers

lstm will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU

I am running the following code for an LSTM on Databricks with a GPU: model = Sequential() model.add(LSTM(64, activation=LeakyReLU(alpha=0.05), batch_input_shape=(1, timesteps, n_features), stateful=False, return_sequences =…
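A sketch of what the warning refers to, assuming TF 2.x: keras.layers.LSTM only dispatches to the fused cuDNN kernel when the defaults are kept (activation='tanh', recurrent_activation='sigmoid', recurrent_dropout=0, unroll=False, use_bias=True). The LeakyReLU activation in the question forces the slower generic kernel:

    from tensorflow import keras

    timesteps, n_features = 10, 4  # illustrative assumptions
    model = keras.Sequential([
        # Default activations kept, so this layer is cuDNN-eligible on GPU
        keras.layers.LSTM(64, input_shape=(timesteps, n_features)),
        keras.layers.Dense(1),
    ])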
18
votes
1 answer

ValueError: Data cardinality is ambiguous

I'm trying to train LSTM network on data taken from a DataFrame. Here's the code: x_lstm=x.to_numpy().reshape(1,x.shape[0],x.shape[1]) model = keras.models.Sequential([ keras.layers.LSTM(x.shape[1], return_sequences=True,…
Arsen Zahray
  • 24,367
  • 48
  • 131
  • 224
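A sketch of the usual cause: "Data cardinality is ambiguous" means x and y disagree on the number of samples (axis 0). Reshaping a whole DataFrame to (1, rows, cols) produces exactly one sample, so y must also contain exactly one target. Shapes below are illustrative assumptions:

    import numpy as np

    x = np.random.rand(200, 5)                     # 200 rows, 5 columns
    x_lstm = x.reshape(1, x.shape[0], x.shape[1])  # 1 sample of 200 timesteps
    y = np.random.rand(1, 1)                       # must also have 1 sample
    assert x_lstm.shape[0] == y.shape[0]           # cardinalities now match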
18
votes
9 answers

CuDNNLSTM: UnknownError: Fail to find the dnn implementation

I have run the model with LSTM as the first layer successfully. But out of curiosity, I replaced LSTM with CuDNNLSTM. After model.fit, it raised the following error message: UnknownError: Fail to find the dnn implementation. [[{{node…
Fay Wang
  • 191
  • 1
  • 1
  • 4
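A sketch of a commonly reported fix: this error often means cuDNN failed to initialize because GPU memory was already fully reserved. Enabling memory growth before building the model frequently resolves it:

    import tensorflow as tf

    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    # TF 1.x equivalent:
    #   config = tf.ConfigProto()
    #   config.gpu_options.allow_growth = True
    #   session = tf.Session(config=config)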
18
votes
1 answer

Keras lstm with masking layer for variable-length inputs

I know this is a subject with a lot of questions, but I couldn't find any solution to my problem. I am training an LSTM network on variable-length inputs using a masking layer, but it seems that it doesn't have any effect. Input shape (100, 362, 24)…
Florian Mutel
  • 1,044
  • 1
  • 6
  • 13
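A sketch of the masking setup, assuming the shapes from the question and padding with zeros: Masking tells downstream layers to skip any timestep whose features all equal mask_value, so the padding value must not occur in real data:

    from tensorflow import keras

    max_len, n_features = 362, 24
    model = keras.Sequential([
        keras.layers.Masking(mask_value=0.0, input_shape=(max_len, n_features)),
        keras.layers.LSTM(32),   # padded timesteps are skipped via the mask
        keras.layers.Dense(1),
    ])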
18
votes
3 answers

How to use additional features along with word embeddings in Keras?

I am training an LSTM model with Keras on a dataset that looks like the following. The variable "Description" is a text field, and "Age" and "Gender" are categorical and continuous fields. Age, Gender, Description 22, M, "purchased a phone" 35, F,…
userxxx
  • 796
  • 10
  • 18
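A sketch of one common pattern, with illustrative vocabulary and feature sizes: run the text through Embedding + LSTM, then concatenate the LSTM output with the extra features (e.g. age and an encoded gender) before the final Dense layer:

    from tensorflow import keras

    vocab_size, max_len, n_extra = 10000, 50, 2   # assumptions, not from the question
    text_in = keras.Input(shape=(max_len,), name="description")
    extra_in = keras.Input(shape=(n_extra,), name="age_gender")

    h = keras.layers.Embedding(vocab_size, 64)(text_in)
    h = keras.layers.LSTM(32)(h)
    h = keras.layers.concatenate([h, extra_in])    # merge text and tabular features
    out = keras.layers.Dense(1, activation="sigmoid")(h)

    model = keras.Model([text_in, extra_in], out)
    model.compile(optimizer="adam", loss="binary_crossentropy")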
18
votes
4 answers

How do I create padded batches in Tensorflow for tf.train.SequenceExample data using the DataSet API?

For training an LSTM model in Tensorflow, I have structured my data into a tf.train.SequenceExample format and stored it into a TFRecord file. I would now like to use the new DataSet API to generate padded batches for training. In the documentation…
Marijn Huijbregts
  • 183
  • 1
  • 1
  • 6
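A sketch of the padding step, assuming a hypothetical SequenceExample schema with one variable-length float feature list named "tokens": padded_batch pads every sequence in a batch up to the longest sequence in that batch:

    import tensorflow as tf

    def parse(serialized):
        # Hypothetical schema; adjust to the actual SequenceExample layout.
        _, seq = tf.io.parse_single_sequence_example(
            serialized,
            sequence_features={"tokens": tf.io.FixedLenSequenceFeature([], tf.float32)},
        )
        return seq["tokens"]

    dataset = (tf.data.TFRecordDataset("data.tfrecord")   # hypothetical file name
               .map(parse)
               .padded_batch(32, padded_shapes=[None]))   # pad the time axis per batch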
18
votes
1 answer

How to construct input data to LSTM for time series multi-step horizon with external features?

I'm trying to use LSTM to do store sales forecasting. Here is how my raw data looks:

| Date       | StoreID | Sales | Temperature | Open | StoreType |
|------------|---------|-------|-------------|------|-----------|
| 01/01/2016 | 1       | …
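A sketch of one common layout, with illustrative numbers: slide a window over each store's table so a sample holds the past lookback rows of all features, and the target holds the next horizon sales values:

    import numpy as np

    lookback, horizon = 14, 7
    data = np.random.rand(365, 4)    # stand-in for [Sales, Temperature, Open, StoreType]
    X, y = [], []
    for t in range(lookback, len(data) - horizon + 1):
        X.append(data[t - lookback:t])       # past window, all features
        y.append(data[t:t + horizon, 0])     # future sales column only
    X, y = np.array(X), np.array(y)          # (samples, 14, 4) and (samples, 7)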
18
votes
4 answers

What are c_state and m_state in Tensorflow LSTM?

Tensorflow r0.12's documentation for tf.nn.rnn_cell.LSTMCell describes this as the init: tf.nn.rnn_cell.LSTMCell.__call__(inputs, state, scope=None) where state is as follows: state: if state_is_tuple is False, this must be a state Tensor, 2-D,…
Haziq Nordin
  • 195
  • 1
  • 3
  • 10
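A sketch of the distinction, assuming TensorFlow 1.x: c_state is the LSTM's internal cell state and m_state is the emitted output/hidden state (usually written h). With state_is_tuple=True they come back as an LSTMStateTuple instead of one concatenated 2-D tensor:

    import tensorflow as tf  # assumes TensorFlow 1.x

    cell = tf.nn.rnn_cell.LSTMCell(128, state_is_tuple=True)
    inputs = tf.placeholder(tf.float32, [32, 10, 8])   # (batch, time, features)
    outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
    c, h = state.c, state.h   # cell ("c") and hidden/output ("m") state, each (32, 128)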
17
votes
1 answer

Adding Attention on top of simple LSTM layer in Tensorflow 2.0

I have a simple network of one LSTM and two Dense layers as such: model = tf.keras.Sequential() model.add(layers.LSTM(20, input_shape=(train_X.shape[1], train_X.shape[2]))) model.add(layers.Dense(20, activation='sigmoid')) model.add(layers.Dense(1,…
greco.roamin
  • 799
  • 1
  • 6
  • 20
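A sketch of one way to bolt attention onto that network in TF 2.x, keeping the question's layer sizes: return the full LSTM sequence, apply keras.layers.Attention as self-attention over it, pool, then attach the Dense head:

    import tensorflow as tf
    from tensorflow.keras import layers

    timesteps, n_features = 10, 4   # illustrative assumptions
    inp = tf.keras.Input(shape=(timesteps, n_features))
    seq = layers.LSTM(20, return_sequences=True)(inp)   # keep per-step outputs
    att = layers.Attention()([seq, seq])                # self-attention: query = value
    pooled = layers.GlobalAveragePooling1D()(att)
    h = layers.Dense(20, activation="sigmoid")(pooled)
    out = layers.Dense(1)(h)
    model = tf.keras.Model(inp, out)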
17
votes
2 answers

Does attention make sense for Autoencoders?

I am struggling with the concept of attention in the context of autoencoders. I believe I understand the usage of attention with regard to seq2seq translation - after training the combined encoder and decoder, we can use both encoder and…