Questions tagged [lstm]

Long short-term memory. A neural network (NN) architecture containing recurrent blocks that can remember a value for an arbitrary length of time. A very popular building block for deep NNs.

Long short-term memory neural networks (LSTMs) are a subset of recurrent neural networks. They can take time-series data and make predictions using knowledge of how the system is evolving.

A major benefit of LSTMs is their ability to store and use long-term information, not just what they are given at a particular instant. For more information on LSTMs, see colah's blog post and MachineLearningMastery.
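For a concrete sense of the API most questions under this tag use, here is a minimal, self-contained Keras sketch; all shapes and sizes are illustrative assumptions, not taken from any particular question.

```python
# A minimal illustrative sketch (hypothetical shapes): an LSTM that reads
# sequences of 10 timesteps with 3 features each and predicts one value.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    LSTM(32, input_shape=(10, 3)),  # (timesteps, features)
    Dense(1),                       # one value predicted per sequence
])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(64, 10, 3)       # 64 dummy sequences
y = np.random.rand(64, 1)
model.fit(x, y, epochs=2, verbose=0)
```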

6289 questions
14 votes • 1 answer

What is a "cell class" in Keras?

Or, more specifically: what is the difference between ConvLSTM2D and ConvLSTM2DCell? What is the difference between SimpleRNN and SimpleRNNCell? The same question applies to GRU and GRUCell. The Keras manuals are not very verbose here. I can see from RTFS (reading…
wl2776 • 4,099 • 4 gold • 35 silver • 77 bronze
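In short, a *Cell computes a single timestep, while the matching layer (or the generic tf.keras.layers.RNN wrapper) iterates that cell over a whole sequence. A small sketch, assuming the TF 2.x Keras API; sizes are illustrative.

```python
import tensorflow as tf

cell = tf.keras.layers.LSTMCell(8)            # computes one timestep
layer_from_cell = tf.keras.layers.RNN(cell)   # iterates the cell over time
builtin_layer = tf.keras.layers.LSTM(8)       # fused, equivalent layer

x = tf.random.normal((4, 10, 3))              # (batch, timesteps, features)
print(layer_from_cell(x).shape)               # (4, 8)
print(builtin_layer(x).shape)                 # (4, 8)
```

The same layer/cell pairing holds for SimpleRNN/SimpleRNNCell and GRU/GRUCell.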
14 votes • 4 answers

Keras LSTM predicted timeseries squashed and shifted

I'm trying to get some hands-on experience with Keras during the holidays, and I thought I'd start with the textbook example of time-series prediction on stock data. So what I'm trying to do is, given the last 48 hours' worth of average price…
cdecker • 4,515 • 8 gold • 46 silver • 75 bronze
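Questions like this usually start from a sliding-window setup. A minimal windowing sketch with placeholder data; the 48-hour window comes from the question, everything else is assumed.

```python
# Turn an hourly price series into (samples, 48, 1) inputs and
# next-hour targets (dummy data, illustrative only).
import numpy as np

prices = np.random.rand(1000).astype("float32")  # placeholder price series
window = 48

X = np.stack([prices[i:i + window] for i in range(len(prices) - window)])
y = prices[window:]
X = X[..., np.newaxis]   # (samples, 48, 1) for an LSTM
print(X.shape, y.shape)  # (952, 48, 1) (952,)
```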
14 votes • 3 answers

How to handle extremely long LSTM sequence length?

I have some data that is sampled at a very high rate (on the order of hundreds of times per second). This results in a huge average sequence length (~90,000 samples) for any given instance. The entire sequence has a single label. I am…
user • 199 • 1 gold • 1 silver • 12 bronze
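One commonly suggested workaround, sketched below under assumed sizes, is to split each long sequence into shorter chunks and carry hidden state across chunks with a stateful LSTM, a form of truncated backpropagation through time.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

chunk_len, features = 500, 1   # assumed chunking of the ~90,000-step sequence
model = Sequential([
    LSTM(64, batch_input_shape=(1, chunk_len, features), stateful=True),
    Dense(1, activation="sigmoid"),   # one label per whole sequence
])
model.compile(optimizer="adam", loss="binary_crossentropy")
# Train on consecutive chunks of one sequence, then reset between sequences:
#   for chunk, label in chunks: model.train_on_batch(chunk, label)
#   model.reset_states()
```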
14 votes • 3 answers

Regularization for LSTM in TensorFlow

TensorFlow offers a nice LSTM wrapper: rnn_cell.BasicLSTM(num_units, forget_bias=1.0, input_size=None, state_is_tuple=False, activation=tanh). I would like to use regularization, say L2 regularization. However, I don't have direct access…
BiBi • 7,418 • 5 gold • 43 silver • 69 bronze
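The question targets the old rnn_cell wrapper; as a point of comparison, the modern tf.keras LSTM layer exposes L2 regularization directly. A sketch of that API, not the one the asker was using; the coefficient is an arbitrary assumption.

```python
import tensorflow as tf

layer = tf.keras.layers.LSTM(
    64,
    kernel_regularizer=tf.keras.regularizers.l2(1e-4),     # input weights
    recurrent_regularizer=tf.keras.regularizers.l2(1e-4),  # recurrent weights
    bias_regularizer=tf.keras.regularizers.l2(1e-4),       # biases
)
```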
14 votes • 2 answers

TensorFlow using LSTMs for generating text

I would like to use TensorFlow to generate text and have been modifying the LSTM tutorial (https://www.tensorflow.org/versions/master/tutorials/recurrent/index.html#recurrent-neural-networks) code to do this; however, my initial solution seems to…
seberik • 405 • 1 gold • 6 silver • 13 bronze
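Generation quality often hinges on how the next token is sampled. A standard temperature-sampling helper, independent of the tutorial code the question modifies:

```python
import numpy as np

def sample(logits, temperature=1.0):
    """Sample a token id from unnormalized logits with a temperature knob."""
    logits = np.asarray(logits, dtype="float64") / temperature
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)

print(sample([2.0, 1.0, 0.1], temperature=0.5))  # usually prints 0
```

Lower temperatures make the output more deterministic; higher ones make it more diverse (and more error-prone).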
13 votes • 4 answers

Why not use mean squared error for classification problems?

I am trying to solve a simple binary classification problem using an LSTM. I am trying to figure out the correct loss function for the network. The issue is, when I use binary cross-entropy as the loss function, the loss values for training and testing…
Hussain Ali • 133 • 1 gold • 1 silver • 4 bronze
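A tiny numeric illustration of one reason cross-entropy is usually preferred for classification: it penalizes a confident wrong prediction much more sharply than MSE does.

```python
import numpy as np

y_true, y_pred = 1.0, 0.01                # confident but wrong prediction
bce = -np.log(y_pred)                     # binary cross-entropy ≈ 4.61
mse = (y_true - y_pred) ** 2              # mean squared error ≈ 0.98
print(bce, mse)
```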
13 votes • 0 answers

LSTM object detection in TensorFlow

Long story short: how to prepare data for LSTM object detection retraining of the TensorFlow master GitHub implementation. Long story: Hi all, I recently found an implementation of an LSTM object detection algorithm based on this…
13 votes • 2 answers

Keras - Add attention mechanism to an LSTM model

With the following code: model = Sequential() num_features = data.shape[2] num_samples = data.shape[1] model.add( LSTM(16, batch_input_shape=(None, num_samples, num_features), return_sequences=True,…
Shlomi Schwartz • 8,693 • 29 gold • 109 silver • 186 bronze
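There are many attention variants; one option in later TF versions is the built-in tf.keras.layers.Attention, sketched here with illustrative sizes (not necessarily the mechanism the asker had in mind):

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = layers.Input(shape=(None, 16))            # (timesteps, features)
seq = layers.LSTM(16, return_sequences=True)(inputs)
attended = layers.Attention()([seq, seq])          # query = value = seq
pooled = layers.GlobalAveragePooling1D()(attended) # collapse the time axis
outputs = layers.Dense(1, activation="sigmoid")(pooled)
model = tf.keras.Model(inputs, outputs)
```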
13 votes • 3 answers

Python Keras: a layer that outputs exactly the same thing as its input

I am using Keras to build a network. During the process, I need a layer that takes an LSTM input and does nothing, just outputs exactly the same thing as its input. I.e., if each input record of the LSTM is like [[A_t1, A_t2, A_t3, A_t4, A_t5, A_t6]], I am looking…
Edamame • 23,718 • 73 gold • 186 silver • 320 bronze
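A pass-through layer needs no custom code in Keras; a minimal sketch:

```python
from tensorflow.keras.layers import Activation, Lambda

identity = Lambda(lambda x: x)    # output == input, no trainable weights
linear = Activation("linear")     # equivalent: the identity activation
```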
13 votes • 4 answers

Keras LSTM - why different results with "same" model & same weights?

(NOTE: Properly fixing the RNG state before each model creation, as described in a comment, practically fixed my problem: results are consistent to within 3 decimals, but they aren't exactly so, so there's a hidden source of…
NeuronQ • 7,527 • 9 gold • 42 silver • 60 bronze
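The usual seeding incantation looks like the sketch below (TF 2.x names; exact determinism can still depend on backend ops and hardware, which matches the hidden source of noise the asker mentions):

```python
import os
import random

import numpy as np
import tensorflow as tf

os.environ["PYTHONHASHSEED"] = "0"  # conventionally set before launch
random.seed(0)                      # Python's own RNG
np.random.seed(0)                   # NumPy (used by some init paths)
tf.random.set_seed(0)               # TensorFlow graph-level seed
```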
13 votes • 4 answers

Prevent over-fitting of text classification using word embeddings with an LSTM

Objective: identifying a class label using a user-entered question (like a question-answering system). The data is extracted from a big PDF file, and I need to predict the page number based on the user's input. It is mostly used with policy documents, where users have questions about…
Somnath Kadam • 6,051 • 6 gold • 21 silver • 37 bronze
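Common regularization knobs for an embedding + LSTM text classifier, sketched with illustrative sizes (the dropout rates, vocabulary size, and class count are all assumptions):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Embedding, LSTM

model = Sequential([
    Embedding(input_dim=20000, output_dim=100),   # assumed vocabulary size
    LSTM(64, dropout=0.3, recurrent_dropout=0.3), # dropout inside the LSTM
    Dropout(0.5),                                 # dropout before the head
    Dense(10, activation="softmax"),              # assumed class count
])
```

Smaller layer sizes, early stopping, and pre-trained (frozen) embeddings are other levers typically discussed alongside dropout.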
13 votes • 1 answer

TensorFlow Serving - Stateful LSTM

Is there a canonical way to maintain a stateful LSTM, etc. with TensorFlow Serving? Using the TensorFlow API directly, this is straightforward, but I'm not certain how best to persist LSTM state between calls after exporting the model…
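One commonly suggested pattern, sketched under assumed sizes, is to make the LSTM state explicit model inputs and outputs so the Serving client round-trips it between requests (an assumption about a workable design, not an official recipe):

```python
import tensorflow as tf
from tensorflow.keras import layers

x = layers.Input(shape=(1, 8))           # one timestep per request
h_in = layers.Input(shape=(32,))         # client-supplied hidden state
c_in = layers.Input(shape=(32,))         # client-supplied cell state
out, h_out, c_out = layers.LSTM(32, return_state=True)(
    x, initial_state=[h_in, c_in])
model = tf.keras.Model([x, h_in, c_in], [out, h_out, c_out])
```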
13 votes • 4 answers

Why do I get a Keras LSTM RNN input_shape error?

I keep getting an input_shape error from the following code. from keras.models import Sequential from keras.layers.core import Dense, Activation, Dropout from keras.layers.recurrent import LSTM def _load_data(data): """ data should be…
Ravaal • 3,233 • 6 gold • 39 silver • 66 bronze
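A frequent cause of this error is that LSTM layers expect 3-D input shaped (batch, timesteps, features). A minimal reshape sketch with dummy data:

```python
import numpy as np

data = np.random.rand(100, 8)        # 2-D: (samples, features)
data3d = data.reshape(100, 1, 8)     # 3-D: (samples, timesteps=1, features)
# The matching layer spec is then: LSTM(..., input_shape=(1, 8))
```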
12 votes • 2 answers

Input 0 of layer conv1d is incompatible with the layer: expected min_ndim=3, found ndim=2. Full shape received: (None, 30)

I have been working on a project for estimating traffic flow using time-series data combined with weather data. I am using a window of 30 values for my time series and 20 weather-related features. I have used the functional API to…
Minura Punchihewa • 1,498 • 1 gold • 12 silver • 35 bronze
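The error itself points at the fix: Conv1D wants at least 3-D input, but the model is being fed (None, 30). A sketch of adding the missing channel axis (layer sizes are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

inp = layers.Input(shape=(30,))                        # the 30-value window
x = layers.Reshape((30, 1))(inp)                       # add a channel axis
x = layers.Conv1D(16, kernel_size=3, activation="relu")(x)
model = tf.keras.Model(inp, x)
```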
12 votes • 1 answer

LSTM Autoencoder problems

TL;DR: the autoencoder underfits the time-series reconstruction and just predicts the average value. Question setup: Here is a summary of my attempt at a sequence-to-sequence autoencoder. This image was taken from this paper:…
rocksNwaves • 5,331 • 4 gold • 38 silver • 77 bronze
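For reference, a common sequence-to-sequence autoencoder layout in Keras uses a RepeatVector bridge between encoder and decoder; a sketch with illustrative dimensions (not the asker's architecture):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, RepeatVector, TimeDistributed

timesteps, features = 50, 1
model = Sequential([
    LSTM(32, input_shape=(timesteps, features)),   # encoder -> latent vector
    RepeatVector(timesteps),                       # repeat latent per step
    LSTM(32, return_sequences=True),               # decoder
    TimeDistributed(Dense(features)),              # reconstruct each step
])
model.compile(optimizer="adam", loss="mse")
```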