Questions tagged [lstm]

Long short-term memory (LSTM): a neural network (NN) architecture built from recurrent blocks that can remember a value for an arbitrary length of time. A very popular building block for deep NNs.

Long short-term memory neural networks (LSTMs) are a type of recurrent neural network (RNN). They can take time-series data and make predictions using knowledge of how the system is evolving.

A major benefit of LSTMs is their ability to store and use long-term information, not just the input they are given at a particular time step. For more information on LSTMs, see colah's blog post and MachineLearningMastery.
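To make the "remember a value for an arbitrary length of time" idea concrete, here is a minimal sketch of one step of a scalar LSTM cell in plain Python, using the standard gate equations. The function `lstm_step`, the `params` layout, and the hand-picked weights are hypothetical and chosen only for illustration; a trained network learns its own weights, and real libraries work on vectors and batches.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (input size 1, hidden size 1).

    params maps each gate name to (input weight, recurrent weight, bias).
    This layout is an assumption made for the sketch, not a library API.
    """
    def pre(name):
        w, u, b = params[name]
        return w * x + u * h_prev + b

    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre("input"))        # how much of the candidate to write
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate value computed from the input
    c = f * c_prev + i * g           # new cell state (the long-term memory)
    h = o * math.tanh(c)             # new hidden state (the step's output)
    return h, c


# Hand-picked (hypothetical) weights: the forget gate is saturated near 1
# and the input gate near 0, so the cell keeps its stored value and
# ignores the incoming inputs.
params = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "output": (0.0, 0.0, 20.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 0.75  # start with 0.75 stored in the cell state
for x in [0.3, -1.2, 2.0, 0.5] * 25:  # 100 time steps of unrelated input
    h, c = lstm_step(x, h, c, params)

print(round(c, 4), round(h, 4))
```

With these gate settings the cell state stays very close to its initial value across all 100 steps, which is the behaviour the description above refers to. Lowering the forget-gate bias makes the stored value decay instead.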

6289 questions
17 votes, 1 answer

Running LSTM with multiple GPUs gets "Input and hidden tensors are not at the same device"

I am trying to train an LSTM layer in PyTorch. I am using 4 GPUs. When initializing, I added the .cuda() call to move the hidden layer to the GPU. But when I run the code with multiple GPUs, I get this runtime error: RuntimeError: Input and…
ida
17 votes, 3 answers

Bidirectional LSTM output question in PyTorch

Hi, I have a question about how to collect the correct result from a BI-LSTM module’s output. Suppose I feed a 10-length sequence into a single-layer LSTM module with 100 hidden units: lstm = nn.LSTM(5, 100, 1, bidirectional=True) output…
Crt Tax
17 votes, 2 answers

Keras EarlyStopping: Which min_delta and patience to use?

I am new to deep learning and Keras, and one of the improvements I am trying to make to my model training process is to use Keras's keras.callbacks.EarlyStopping callback. Based on the output from training my model, does it seem reasonable…
Nyxynyx
17 votes, 2 answers

Using Keras for video prediction (time series)

I want to predict the next frame of a (greyscale) video given N previous frames - using CNNs or RNNs in Keras. Most tutorials and other information regarding time series prediction and Keras use a 1-dimensional input in their network but mine would…
Isa
17 votes, 4 answers

Initializing LSTM hidden state Tensorflow/Keras

Can someone explain how I can initialize the hidden state of an LSTM in TensorFlow? I am trying to build an LSTM recurrent auto-encoder, so after I have that model trained, I want to transfer the learned hidden state of the unsupervised model to the hidden state of…
Tommy
17 votes, 3 answers

Implementing Bi-directional LSTM-CRF Network

I need to implement a bidirectional LSTM network with a CRF layer at the end, specifically the model presented in this paper, and train it. http://www.aclweb.org/anthology/P15-1109 I would prefer to implement it in Python. Can anyone present some…
Samik
16 votes, 7 answers

PyTorch Model Training: RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

After training a PyTorch model on a GPU for several hours, the program fails with the error RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR Training Conditions Neural Network: PyTorch 4-layer nn.LSTM with nn.Linear output Deep Q Network…
Athena Wisdom
16 votes, 1 answer

What exactly is timestep in an LSTM Model?

I am a newbie to LSTMs and RNNs as a whole, and I've been racking my brain to understand what exactly a timestep is. I would really appreciate an intuitive explanation of this.
Steven Wang
16 votes, 1 answer

Doubts regarding `Understanding Keras LSTMs`

I am new to LSTMs and going through the Understanding Keras LSTMs and had some silly doubts related to a beautiful answer by Daniel Moller. Here are some of my doubts: There are 2 ways specified under the Achieving one to many section where it’s…
asn
16 votes, 3 answers

Keras LSTM Autoencoder time-series reconstruction

I am trying to reconstruct time series data with an LSTM autoencoder (Keras). Now I want to train the autoencoder on a small number of samples (5 samples; each sample is 500 time steps long and has 1 dimension). I want to make sure that the model can reconstruct…
Tombozik
16 votes, 2 answers

How to Merge Numerical and Embedding Sequential Models to treat categories in RNN

I would like to build a one-layer LSTM model with embeddings for my categorical features. I currently have numerical features and a few categorical features, such as Location, which can't be one-hot encoded, e.g. using pd.get_dummies(), due to…
GRS
16 votes, 2 answers

How to combine numerical and categorical values in a vector as input for LSTM?

import pandas as pd import numpy as np rands = np.random.random(7) days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'] dates = pd.date_range('2018-01-01', '2018-01-07') df = pd.DataFrame({'date': dates, 'days':…
TheDarkKnight
16 votes, 2 answers

Keras LSTM for Text Generation keeps repeating a line or a sequence

I roughly followed this tutorial: https://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/ A notable difference is that I use 2 LSTM layers with dropout. My data set is different (music data-set in abc…
16 votes, 2 answers

Attention Layer throwing TypeError: Permute layer does not support masking in Keras

I have been following this post in order to implement attention layer over my LSTM model. Code for the attention layer: INPUT_DIM = 2 TIME_STEPS = 20 SINGLE_ATTENTION_VECTOR = False APPLY_ATTENTION_BEFORE_LSTM = False def…
Saurav--
16 votes, 2 answers

TypeError: can't pickle _thread.lock objects in Seq2Seq

I'm having trouble using buckets in my Tensorflow model. When I run it with buckets = [(100, 100)], it works fine. When I run it with buckets = [(100, 100), (200, 200)] it doesn't work at all (stacktrace at bottom). Interestingly, running…
Evan Weissburg