Questions tagged [lstm]

Long short-term memory. A neural network (NN) architecture that contains recurrent NN blocks that can remember a value for an arbitrary length of time. A very popular building block for deep NNs.

Long short-term memory neural networks (LSTMs) are a subset of recurrent neural networks. They can take time-series data and make predictions using knowledge of how the system is evolving.

A major benefit of LSTMs is their ability to store and use long-term information, not just the input they are given at a particular time step. For more information on LSTMs, see colah's blog post and MachineLearningMastery.

6289 questions
29 votes, 2 answers

TensorFlow: Remember LSTM state for next batch (stateful LSTM)

Given a trained LSTM model I want to perform inference for single timesteps, i.e. seq_length = 1 in the example below. After each timestep the internal LSTM (memory and hidden) states need to be remembered for the next 'batch'. For the very…
verified.human • 1,287 • 3 • 17 • 26
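
A minimal tf.keras sketch of the usual answer: build the inference model with stateful=True and a fixed batch_input_shape so the memory and hidden state carry over between predict() calls. Layer sizes here are assumptions, not from the question.

```python
import numpy as np
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

n_features, n_units = 8, 32  # assumed sizes, not from the question

# batch_input_shape pins batch_size=1 and seq_length=1; stateful=True makes
# Keras keep the (memory, hidden) state between successive predict() calls.
model = Sequential([
    LSTM(n_units, stateful=True, batch_input_shape=(1, 1, n_features)),
    Dense(1),
])

# Feed one timestep at a time; the internal state carries over automatically.
for t in range(10):
    x_t = np.random.rand(1, 1, n_features).astype("float32")
    y_t = model.predict(x_t, verbose=0)

model.reset_states()  # clear the state before starting a new sequence
```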
28 votes, 2 answers

How to set up 1D-Convolution and LSTM in Keras

I would like to use a 1D-Conv layer followed by an LSTM layer to classify a 16-channel 400-timestep signal. The input shape is composed of: X = (n_samples, n_timesteps, n_features), where n_samples=476, n_timesteps=400, n_features=16 are the number of…
Thuan N. • 369 • 1 • 5 • 11
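
A sketch of one common way to wire this up in Keras, using the question's shapes; the filter count, kernel size, and n_classes are assumptions.

```python
from tensorflow.keras.layers import Conv1D, Dense, LSTM, MaxPooling1D
from tensorflow.keras.models import Sequential

n_timesteps, n_features, n_classes = 400, 16, 3  # n_classes is assumed

model = Sequential([
    # Conv1D slides along the time axis; the 16 channels are the features.
    Conv1D(64, kernel_size=5, activation="relu",
           input_shape=(n_timesteps, n_features)),
    MaxPooling1D(pool_size=4),   # shorten the sequence before the LSTM
    LSTM(64),                    # return only the final hidden state
    Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```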
27 votes, 2 answers

Difference between 1 LSTM with num_layers = 2 and 2 LSTMs in pytorch

I am new to deep learning and currently working on using LSTMs for language modeling. I was looking at the pytorch documentation and was confused by it. If I create a nn.LSTM(input_size, hidden_size, num_layers) where hidden_size = 4 and…
user3828311 • 907 • 4 • 11 • 20
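
A small sketch contrasting the two constructions; sizes are illustrative, not from the question.

```python
import torch
import torch.nn as nn

input_size, hidden_size = 10, 4  # hidden_size = 4 as in the question

# One module with two stacked layers: layer 2 consumes layer 1's hidden
# states at every timestep (and inter-layer dropout could be applied).
stacked = nn.LSTM(input_size, hidden_size, num_layers=2)

# The same wiring with two separate single-layer modules:
lstm1 = nn.LSTM(input_size, hidden_size)
lstm2 = nn.LSTM(hidden_size, hidden_size)  # input is now hidden_size

x = torch.randn(7, 1, input_size)  # (seq_len, batch, input_size)
out_a, _ = stacked(x)
h1, _ = lstm1(x)
out_b, _ = lstm2(h1)
print(out_a.shape, out_b.shape)    # both torch.Size([7, 1, 4])
```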
27 votes, 1 answer

Using pre-trained word2vec with LSTM for word generation

LSTM/RNN can be used for text generation. This shows a way to use pre-trained GloVe word embeddings in a Keras model. How can I use pre-trained Word2Vec word embeddings with a Keras LSTM model? This post did help. How to predict / generate the next word when…
Vishal Shukla • 277 • 1 • 5 • 11
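
One commonly used pattern, sketched with gensim: build an embedding matrix from the word2vec vectors and seed a frozen Keras Embedding layer with it. The vocabulary and the vectors file path here are placeholders.

```python
import numpy as np
from gensim.models import KeyedVectors
from tensorflow.keras.layers import Dense, Embedding, LSTM
from tensorflow.keras.models import Sequential

word_index = {"hello": 1, "world": 2}  # placeholder vocabulary
w2v = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)  # placeholder path

embedding_dim = w2v.vector_size
embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, i in word_index.items():
    if word in w2v:                    # leave unknown words as zero vectors
        embedding_matrix[i] = w2v[word]

model = Sequential([
    # weights=[...] seeds the layer; trainable=False freezes the vectors.
    Embedding(len(word_index) + 1, embedding_dim,
              weights=[embedding_matrix], trainable=False),
    LSTM(128),
    Dense(len(word_index) + 1, activation="softmax"),  # next-word distribution
])
```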
26 votes, 2 answers

What is the difference between the terms accuracy and validation accuracy

I have used an LSTM from Keras to build a model that can detect whether two questions on Stack Overflow are duplicates or not. When I run the model I see the following output in the epochs. Epoch 23/200 727722/727722 [==============================] - 67s -…
Dookoto_Sea • 521 • 1 • 5 • 16
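
In short: accuracy is computed on the batches the model trains on, while validation accuracy is computed on held-out data, so a widening gap between the two usually signals overfitting. A sketch, assuming a compiled Keras model and data named model, X, y:

```python
# validation_split holds out the last 10% of the samples: `acc` is measured
# on the batches the model trains on, `val_acc` on that held-out slice.
history = model.fit(X, y, epochs=200, batch_size=384, validation_split=0.1)

# A widening gap between the two curves is the usual sign of overfitting.
# (Newer tf.keras versions name the keys "accuracy" / "val_accuracy".)
print(history.history["acc"][-1], history.history["val_acc"][-1])
```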
26 votes, 1 answer

How do I train tesseract 4 with image data instead of a font file?

I'm trying to train Tesseract 4 with images instead of fonts. The docs explain only the approach with fonts, not with images. I know how it works when I use a prior version of Tesseract, but I haven't figured out how to use the box/tiff…
claim • 506 • 6 • 13
25 votes, 2 answers

Understanding Tensorflow LSTM Input shape

I have a dataset X which consists of N = 4000 samples; each sample consists of d = 2 features (continuous values) spanning back t = 10 time steps. I also have the corresponding 'labels' for each sample, which are also continuous values, at time step 11.…
Renier Botha • 830 • 1 • 10 • 19
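
A sketch of the shape convention this question turns on: Keras/TensorFlow recurrent layers expect (samples, timesteps, features), so the data should arrive as (4000, 10, 2). The unit count is an assumption.

```python
import numpy as np
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.models import Sequential

N, t, d = 4000, 10, 2
X = np.random.rand(N, t, d)   # (samples, timesteps, features)
y = np.random.rand(N, 1)      # one continuous label per sample (time step 11)

model = Sequential([
    LSTM(32, input_shape=(t, d)),  # batch size is left unspecified
    Dense(1),                      # regression head for the continuous label
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32)
```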
25 votes, 6 answers

How to calculate the number of parameters of an LSTM network?

Is there a way to calculate the total number of parameters in an LSTM network? I have found an example but I'm unsure how correct it is, or whether I have understood it correctly. For example, consider the following: from keras.models import…
Arsenal Fanatic • 3,663 • 6 • 38 • 53
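
The standard counting argument: an LSTM layer has four gate blocks, each with an input kernel, a recurrent kernel, and a bias, giving 4 × (units × (units + input_dim) + units) parameters. A quick check against Keras (sizes arbitrary):

```python
from tensorflow.keras.layers import LSTM
from tensorflow.keras.models import Sequential

units, input_dim, timesteps = 256, 4096, 10  # arbitrary illustrative sizes

model = Sequential([LSTM(units, input_shape=(timesteps, input_dim))])

# Four gate blocks (input, forget, cell, output), each with an input kernel
# (input_dim x units), a recurrent kernel (units x units) and a bias (units):
expected = 4 * (units * (units + input_dim) + units)
assert model.count_params() == expected
print(expected)  # 4457472
```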
25 votes, 1 answer

Siamese Neural Network in TensorFlow

I'm trying to implement a Siamese Neural Network in TensorFlow but I cannot really find any working example on the Internet (see Yann LeCun's paper). The architecture I'm trying to build would consist of two LSTMs sharing weights and only connected…
BiBi • 7,418 • 5 • 43 • 69
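
A sketch of the weight-sharing idea in the Keras functional API: instantiating one LSTM layer and calling it on both inputs makes the two branches share parameters. The sizes and the distance function are assumptions, not from the question.

```python
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Input, Lambda, LSTM
from tensorflow.keras.models import Model

timesteps, features = 20, 50   # assumed input shape

# A single LSTM instance called on both inputs => both branches share weights.
shared_lstm = LSTM(64)

left = Input(shape=(timesteps, features))
right = Input(shape=(timesteps, features))
h_left, h_right = shared_lstm(left), shared_lstm(right)

# Euclidean distance between the two encodings (one of several choices).
distance = Lambda(
    lambda t: K.sqrt(K.sum(K.square(t[0] - t[1]), axis=1, keepdims=True))
)([h_left, h_right])

model = Model(inputs=[left, right], outputs=distance)
```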
25 votes, 5 answers

How to deal with batches with variable-length sequences in TensorFlow?

I was trying to use an RNN (specifically, LSTM) for sequence prediction. However, I ran into an issue with variable sequence lengths. For example, sent_1 = "I am flying to Dubain" sent_2 = "I was traveling from US to Dubai" I am trying to…
Seja Nair • 787 • 2 • 9 • 23
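
One common remedy, sketched here with tf.keras: pad the sequences to a common length and add a Masking layer so the padded steps are ignored (in graph-mode TensorFlow the analogous tool is the sequence_length argument of dynamic_rnn). The token ids are toy data.

```python
import numpy as np
from tensorflow.keras.layers import Dense, LSTM, Masking
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Toy token-id sequences of different lengths (stand-ins for the sentences).
seqs = [[3, 7, 2, 9, 4], [3, 6, 8, 1, 9, 5, 4]]

padded = pad_sequences(seqs, padding="post")      # shape (2, 7), zero-padded
x = np.expand_dims(padded, -1).astype("float32")  # (2, 7, 1)

model = Sequential([
    Masking(mask_value=0.0, input_shape=(None, 1)),  # skip the padded steps
    LSTM(16),
    Dense(1),
])
print(model.predict(x).shape)  # (2, 1)
```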
24 votes, 1 answer

Proper way to feed time-series data to stateful LSTM?

Let's suppose I have a sequence of integers: 0,1,2, .. and want to predict the next integer given the last 3 integers, e.g.: [0,1,2]->3, [3,4,5]->6, etc. Suppose I set up my model like so: batch_size=1 time_steps=3 model =…
rmccabe3701 • 1,418 • 13 • 31
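
A sketch of the usual recipe under the question's batch_size=1, time_steps=3 setup: keep the windows in order with shuffle=False and reset the state between epochs rather than between consecutive windows. Unit count and data length are assumed.

```python
import numpy as np
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.models import Sequential

batch_size, time_steps = 1, 3

model = Sequential([
    LSTM(32, stateful=True, batch_input_shape=(batch_size, time_steps, 1)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")

data = np.arange(30, dtype="float32")
X = data[:27].reshape(-1, time_steps, 1)  # windows [0,1,2], [3,4,5], ...
y = data[3::3].reshape(-1, 1)             # targets 3, 6, ...

# shuffle=False keeps windows in order so the carried state is meaningful;
# reset between epochs, not between consecutive windows.
for _ in range(5):
    model.fit(X, y, batch_size=batch_size, shuffle=False, verbose=0)
    model.reset_states()
```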
24 votes, 2 answers

PyTorch: manually setting weight parameters with numpy array for GRU / LSTM

I'm trying to fill a GRU/LSTM with manually defined parameters in pytorch. I have numpy arrays for the parameters, with shapes as defined in the documentation (https://pytorch.org/docs/stable/nn.html#torch.nn.GRU). It seems to work, but I'm not sure…
ytrewq • 3,670 • 9 • 42 • 71
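
A minimal sketch of doing this safely: copy the arrays into the layer's registered parameters under torch.no_grad(). For a GRU the stacked gate dimension is 3*hidden_size (it would be 4*hidden_size for an LSTM).

```python
import numpy as np
import torch
import torch.nn as nn

gru = nn.GRU(input_size=5, hidden_size=3, num_layers=1)

# Numpy arrays with the shapes nn.GRU documents for layer 0:
w_ih = np.random.randn(3 * 3, 5).astype(np.float32)  # (3*hidden, input)
w_hh = np.random.randn(3 * 3, 3).astype(np.float32)  # (3*hidden, hidden)
b_ih = np.random.randn(3 * 3).astype(np.float32)
b_hh = np.random.randn(3 * 3).astype(np.float32)

# copy_ under no_grad replaces the values in place without upsetting autograd.
with torch.no_grad():
    gru.weight_ih_l0.copy_(torch.from_numpy(w_ih))
    gru.weight_hh_l0.copy_(torch.from_numpy(w_hh))
    gru.bias_ih_l0.copy_(torch.from_numpy(b_ih))
    gru.bias_hh_l0.copy_(torch.from_numpy(b_hh))
```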
23 votes, 1 answer

expected ndim=3, found ndim=2

I'm new to Keras and I'm trying to implement a sequence-to-sequence LSTM. In particular, I have a dataset with 9 features and I want to predict 5 continuous values. I split the training and test sets and their shapes are, respectively: X TRAIN…
mht • 381 • 1 • 2 • 12
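
This error usually means the data is 2-D (samples, features) while a recurrent layer wants 3-D (samples, timesteps, features); reshaping adds the missing time axis. A sketch with assumed sizes:

```python
import numpy as np

# Keras recurrent layers want 3-D input: (samples, timesteps, features).
# A 2-D array (samples, features) triggers "expected ndim=3, found ndim=2".
X_train = np.random.rand(1000, 9)       # assumed 2-D training data
X_train_3d = X_train.reshape(-1, 1, 9)  # add a length-1 time axis
print(X_train_3d.shape)                 # (1000, 1, 9)

# The matching layer declaration would then be, e.g.:
#   LSTM(units, input_shape=(1, 9))
```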
23 votes, 3 answers

Predicting a multiple forward time step of a time series using LSTM

I want to predict certain values that are weekly predictable (low SNR). I need to predict the whole time series of a year, formed by the weeks of the year (52 values - Figure 1). My first idea was to develop a many-to-many LSTM model (Figure 2) using…
Lucas Brito • 1,028 • 1 • 18 • 36
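
One sketch of a many-to-many (encoder-decoder) layout for this: encode the input year, repeat the summary 52 times, and emit one value per week via TimeDistributed. All sizes are assumptions.

```python
from tensorflow.keras.layers import Dense, LSTM, RepeatVector, TimeDistributed
from tensorflow.keras.models import Sequential

n_in, n_out = 52, 52  # one year of weekly values in, one year out (assumed)

model = Sequential([
    LSTM(64, input_shape=(n_in, 1)),  # encoder summarizes the input year
    RepeatVector(n_out),              # feed the summary to all 52 output steps
    LSTM(64, return_sequences=True),  # decoder unrolls the output sequence
    TimeDistributed(Dense(1)),        # one predicted value per week
])
model.compile(optimizer="adam", loss="mse")
```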
23 votes, 2 answers

Tree-LSTM in Keras

I would like to use a tree-LSTM in keras, similar to what is described in this article: https://arxiv.org/abs/1503.00075. It is essentially similar to a Long Short-Term Memory network, but with a tree-like input sequence instead of a chain-like…
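
Keras has no built-in tree-structured recurrence, so answers typically implement the cell by hand. A numpy sketch of the Child-Sum Tree-LSTM equations from the cited paper (arXiv:1503.00075); dimensions and initialization are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ChildSumTreeLSTM:
    """Child-Sum Tree-LSTM cell (Tai et al. 2015), forward pass only."""

    def __init__(self, in_dim, mem_dim, seed=0):
        rng = np.random.default_rng(seed)
        # One (W, U, b) triple per gate: input i, forget f, output o, update u.
        self.W = {g: rng.normal(0, 0.1, (mem_dim, in_dim)) for g in "ifou"}
        self.U = {g: rng.normal(0, 0.1, (mem_dim, mem_dim)) for g in "ifou"}
        self.b = {g: np.zeros(mem_dim) for g in "ifou"}

    def node(self, x, children):
        """children is a list of (h, c) pairs; returns this node's (h, c)."""
        h_sum = sum((h for h, _ in children), np.zeros_like(self.b["i"]))
        i = sigmoid(self.W["i"] @ x + self.U["i"] @ h_sum + self.b["i"])
        o = sigmoid(self.W["o"] @ x + self.U["o"] @ h_sum + self.b["o"])
        u = np.tanh(self.W["u"] @ x + self.U["u"] @ h_sum + self.b["u"])
        c = i * u
        # One forget gate per child, computed from that child's own h.
        for h_k, c_k in children:
            f_k = sigmoid(self.W["f"] @ x + self.U["f"] @ h_k + self.b["f"])
            c += f_k * c_k
        return o * np.tanh(c), c

cell = ChildSumTreeLSTM(in_dim=4, mem_dim=8)
leaf1 = cell.node(np.ones(4), [])
leaf2 = cell.node(np.ones(4), [])
root_h, root_c = cell.node(np.ones(4), [leaf1, leaf2])
print(root_h.shape)  # (8,)
```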