Questions tagged [recurrent-neural-network]

A recurrent neural network (RNN) is a class of artificial neural networks in which connections between units form a directed cycle, letting the network carry state across the steps of a sequence.

3356 questions
172 votes · 5 answers

Why do we "pack" the sequences in PyTorch?

I was trying to replicate How to use packing for variable-length sequence inputs for rnn but I guess I first need to understand why we need to "pack" the sequence. I understand why we "pad" them but why is "packing" (via pack_padded_sequence)…
aerin · 20,607 · 28 gold · 102 silver · 140 bronze
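A framework-free sketch of what packing buys you: `pack_padded_sequence` flattens a length-sorted batch time-step by time-step and records how many sequences are still "alive" at each step, so the RNN never wastes computation on padding positions. The sequences and values below are illustrative, not from the question.

```python
# Conceptual sketch (plain Python, no torch) of what "packing" produces.
def pack(sequences):
    """Pack variable-length sequences, assumed sorted by length, descending."""
    max_len = len(sequences[0])
    data, batch_sizes = [], []
    for t in range(max_len):
        # Take step t from every sequence that still has a step t.
        alive = [s[t] for s in sequences if len(s) > t]
        data.extend(alive)
        batch_sizes.append(len(alive))
    return data, batch_sizes

seqs = [[1, 2, 3], [4, 5], [6]]        # already sorted longest-first
data, batch_sizes = pack(seqs)
print(data)         # [1, 4, 6, 2, 5, 3]
print(batch_sizes)  # [3, 2, 1]
```

`batch_sizes` shrinking from 3 to 1 is exactly the information that lets the RNN stop computing for finished sequences instead of churning through pad tokens.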
159 votes · 2 answers

Many to one and many to many LSTM examples in Keras

I'm trying to understand LSTMs and how to build them with Keras. I found out that there are principally 4 modes to run an RNN (the 4 right ones in the picture). Image source: Andrej Karpathy. Now I wonder how a minimalistic code snippet for each of…
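The many-to-one / many-to-many distinction can be shown without any framework: a toy single-unit RNN either returns every per-step state or only the last one. In Keras this corresponds to `return_sequences=True` vs `False`; the weights below are illustrative constants, not trained values.

```python
import math

W_X, W_H = 0.5, 0.8   # illustrative input and recurrent weights

def rnn(xs):
    """Toy single-unit RNN: returns the hidden state at every step."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(W_X * x + W_H * h)
        states.append(h)
    return states

xs = [1.0, -1.0, 0.5]
many_to_many = rnn(xs)       # one output per time step (return_sequences=True)
many_to_one = rnn(xs)[-1]    # only the final output (return_sequences=False)
print(len(many_to_many))     # 3
```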
109 votes · 5 answers

What's the difference between "hidden" and "output" in PyTorch LSTM?

I'm having trouble understanding the documentation for PyTorch's LSTM module (and also RNN and GRU, which are similar). Regarding the outputs, it says: Outputs: output, (h_n, c_n) output (seq_len, batch, hidden_size * num_directions): tensor…
N. Virgo · 7,970 · 11 gold · 44 silver · 65 bronze
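The relationship between the two return values can be sketched framework-free: `output` stacks the (top-layer) hidden state at every time step, while `h_n` is the hidden state after the final step only — so for a single-layer, unidirectional RNN, `output[-1]` equals `h_n`. The weights here are illustrative, not PyTorch's.

```python
import math

def run_rnn(xs, w_x=0.4, w_h=0.6):
    """Toy RNN returning (output, h_n) the way PyTorch shapes them."""
    h, outputs = 0.0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        outputs.append(h)   # what PyTorch returns as `output`
    return outputs, h       # h is `h_n`, the state after the last step

outputs, h_n = run_rnn([0.2, -0.7, 1.0])
print(outputs[-1] == h_n)   # True
```

For multi-layer or bidirectional LSTMs the correspondence is per-layer and per-direction, which is where the documented shapes come from.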
102 votes · 2 answers

What is the intuition of using tanh in LSTM?

In an LSTM network (Understanding LSTMs), why do the input gate and output gate use tanh? What is the intuition behind this? Is it just a nonlinear transformation? If so, can I change both to another activation function (e.g., ReLU)?
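One common intuition can be demonstrated in a few lines: tanh squashes its input into (-1, 1), so a state fed back through it repeatedly stays bounded, whereas an unbounded activation like ReLU can let the recurrence blow up. The gain of 1.5 below is an arbitrary illustrative choice.

```python
import math

def iterate(f, x, steps=50):
    """Repeatedly feed the state back through activation f."""
    for _ in range(steps):
        x = f(1.5 * x)      # gain > 1 to stress the recurrence
    return x

tanh_state = iterate(math.tanh, 1.0)
relu_state = iterate(lambda v: max(0.0, v), 1.0)
print(abs(tanh_state) < 1)   # True: bounded for any number of steps
print(relu_state > 1e6)      # True: 1.5 ** 50, the state has exploded
```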
98 votes · 5 answers

What's the difference between a bidirectional LSTM and an LSTM?

Can someone please explain this? I know bidirectional LSTMs have a forward and backward pass but what is the advantage of this over a unidirectional LSTM? What is each of them better suited for?
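The mechanics are easy to sketch without a framework: a bidirectional RNN runs one pass left-to-right and a second pass right-to-left, then pairs the two states at each position, so every time step sees both past and future context. The weights below are illustrative.

```python
import math

def forward_pass(xs, w_x=0.4, w_h=0.6):
    """One directional pass of a toy single-unit RNN."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states

xs = [0.1, 0.5, -0.3]
fwd = forward_pass(xs)
bwd = forward_pass(xs[::-1])[::-1]      # backward pass, re-aligned to xs
bidir = [(f, b) for f, b in zip(fwd, bwd)]
print(len(bidir), len(bidir[0]))        # 3 time steps, 2 states per step
```

That need for the full sequence up front is also why bidirectional models suit offline tagging-style tasks but not streaming prediction.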
67 votes · 8 answers

What's the difference between convolutional and recurrent neural networks?

I'm new to the topic of neural networks. I came across the two terms convolutional neural network and recurrent neural network. I'm wondering if these two terms are referring to the same thing, or, if not, what would be the difference between them?
Tal_ · 761 · 2 gold · 6 silver · 13 bronze
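A minimal contrast, with illustrative kernels and weights: a 1-D convolution applies the same fixed-size filter at every position (local windows, no memory), while an RNN threads a state through the sequence (a summary of everything seen so far).

```python
import math

def conv1d(xs, kernel):
    """Slide a fixed window over xs: each output sees only k inputs."""
    k = len(kernel)
    return [sum(w * x for w, x in zip(kernel, xs[i:i + k]))
            for i in range(len(xs) - k + 1)]

def rnn(xs, w_x=0.5, w_h=0.5):
    """Carry one state through xs: the output depends on all inputs."""
    h = 0.0
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
    return h

xs = [1.0, 2.0, 3.0, 4.0]
print(conv1d(xs, [0.5, 0.5]))   # [1.5, 2.5, 3.5]: local averages only
print(rnn(xs))                  # one state summarizing the whole sequence
```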
64 votes · 2 answers

How to use return_sequences option and TimeDistributed layer in Keras?

I have a dialog corpus like below, and I want to implement an LSTM model which predicts a system action. The system action is described as a bit vector, and a user input is calculated as a word embedding, which is also a bit vector. t1: user: "Do you…
jef · 3,890 · 10 gold · 42 silver · 76 bronze
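The two options fit together like this, sketched framework-free: with `return_sequences=True` an LSTM emits one state per time step, and `TimeDistributed` then applies the same dense layer to each of those states independently. The states and weights below are illustrative numbers, not model outputs.

```python
# 3 time steps, 2 units each — the kind of tensor return_sequences=True yields.
per_step_states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

def dense(state, weights=(2.0, 1.0)):
    """One shared dense layer (illustrative fixed weights, no bias)."""
    return sum(w * s for w, s in zip(weights, state))

# TimeDistributed(Dense(...)) == apply the same layer at every step.
time_distributed = [dense(s) for s in per_step_states]
print(time_distributed)   # one prediction per time step
```

With `return_sequences=False` you would instead get only `per_step_states[-1]` and a single prediction for the whole sequence.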
62 votes · 4 answers

How do I create a variable-length input LSTM in Keras?

I am trying to do some vanilla pattern recognition with an LSTM using Keras to predict the next element in a sequence. My data look like this: where the label of the training sequence is the last element in the list:…
erip · 16,374 · 11 gold · 66 silver · 121 bronze
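The usual workaround, sketched in plain Python: pad every sequence in a batch to the batch maximum and carry a mask so the model can ignore the padded positions — in Keras this is `pad_sequences` plus a `Masking` layer (or `mask_zero=True` on an `Embedding`). The sequences below are illustrative.

```python
def pad_batch(sequences, pad_value=0):
    """Pad each sequence to the batch max and build a matching mask."""
    max_len = max(len(s) for s in sequences)
    padded = [s + [pad_value] * (max_len - len(s)) for s in sequences]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in sequences]
    return padded, mask

padded, mask = pad_batch([[5, 6, 7], [8]])
print(padded)  # [[5, 6, 7], [8, 0, 0]]
print(mask)    # [[1, 1, 1], [1, 0, 0]]
```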
59 votes · 11 answers

What is num_units in tensorflow BasicLSTMCell?

In the MNIST LSTM examples, I don't understand what "hidden layer" means. Is it the imaginary layer formed when you represent an unrolled RNN over time? Why is num_units = 128 in most cases?
Subrat · 980 · 2 gold · 11 silver · 17 bronze
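A shape-only sketch of the answer: `num_units` is the dimensionality (width) of the hidden and cell state vectors, not a count of stacked layers or unrolled time steps. The `step` function below only mimics the shapes; a real LSTM cell mixes the input and state through learned gates.

```python
num_units = 4

def step(x, h):
    """Illustrative state update: keeps the state num_units wide."""
    return [0.5 * x + 0.5 * hi for hi in h]

h = [0.0] * num_units
for x in [1.0, 2.0, 3.0]:       # any sequence length
    h = step(x, h)
print(len(h))   # 4: the state width never depends on sequence length
```

128 is simply a conventional power-of-two default for that width, tuned like any other hyperparameter.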
58 votes · 6 answers

No module named 'tqdm'

I am running the following pixel recurrent neural network (RNN) code using Python 3.6 import os import logging import numpy as np from tqdm import trange import tensorflow as tf from utils import * from network import Network from statistic import…
A. Syam · 759 · 1 gold · 6 silver · 9 bronze
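This error just means the tqdm package isn't installed in the active environment; `pip install tqdm` fixes it. A defensive sketch that degrades to a plain `range` when tqdm is missing:

```python
try:
    from tqdm import trange
except ImportError:
    trange = range   # no progress bar, same iteration behaviour

# trange(5) yields 0..4 either way; only the progress display differs.
total = sum(trange(5))
print(total)   # 10
```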
54 votes · 2 answers

Pytorch - RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed

I keep running into this error: RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time. I searched the PyTorch forum, but still…
Viet Phan · 1,999 · 3 gold · 23 silver · 40 bronze
41 votes · 1 answer

Shuffling training data with LSTM RNN

Since an LSTM RNN uses previous events to predict current sequences, why do we shuffle the training data? Don't we lose the temporal ordering of the training data? How is it still effective at making predictions after being trained on shuffled…
hellowill89 · 1,538 · 2 gold · 15 silver · 26 bronze
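The resolution is easy to demonstrate: with windowed training data you shuffle the *samples* (windows), not the raw time steps, so each sample keeps its internal temporal order and the LSTM still sees valid sequences. The toy series below is illustrative.

```python
import random

series = list(range(10))        # a strictly increasing toy time series
window = 3
samples = [series[i:i + window] for i in range(len(series) - window + 1)]

random.seed(0)
random.shuffle(samples)          # the order OF samples is lost...
ordered_inside = all(s == sorted(s) for s in samples)
print(ordered_inside)            # ...but order INSIDE each window survives
```

Shuffling at the sample level decorrelates consecutive gradient updates without destroying the temporal structure each sample was built from.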
40 votes · 4 answers

ValueError: Input 0 is incompatible with layer lstm_13: expected ndim=3, found ndim=4

I am trying for multi-class classification and here are the details of my training input and output: train_input.shape= (1, 95000, 360) (95000 length input array with each element being an array of 360 length) train_output.shape = (1, 95000, 22)…
Urja Pawar · 1,087 · 1 gold · 15 silver · 29 bronze
38 votes · 1 answer

Understanding Keras LSTMs: Role of Batch-size and Statefulness

Sources There are several sources out there explaining stateful / stateless LSTMs and the role of batch_size which I've read already. I'll refer to them later in my post: [1]…
ascripter · 5,665 · 12 gold · 45 silver · 68 bronze
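The core of statefulness can be sketched framework-free: with `stateful=True` the final hidden state of one batch becomes the initial state of the next, instead of resetting to zero, so one long series can be learned across consecutive batches. The weights and inputs below are illustrative.

```python
import math

def run(xs, h0=0.0, w_x=0.4, w_h=0.6):
    """Toy RNN pass returning the final hidden state."""
    h = h0
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
    return h

batch1, batch2 = [0.3, 0.7], [0.2, 0.9]
h = run(batch1)
stateful = run(batch2, h0=h)     # state carried over between batches
stateless = run(batch2, h0=0.0)  # state reset between batches (the default)
print(stateful != stateless)     # True: the carried state changes the result
```

This is also why stateful models pin down `batch_size`: sample i of batch n+1 must continue sample i of batch n.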
36 votes · 3 answers

Error when checking model input: expected lstm_1_input to have 3 dimensions, but got array with shape (339732, 29)

My input is simply a csv file with 339732 rows and two columns: the first being 29 feature values, i.e. X; the second being a binary label value, i.e. Y. I am trying to train my data on a stacked LSTM model: data_dim = 29 timesteps = 8 num_classes…
Saurav-- · 1,530 · 2 gold · 15 silver · 33 bronze
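The fix behind this class of error, sketched framework-free: Keras LSTMs expect input shaped (samples, timesteps, features), so a flat table of rows must be cut into windows of `timesteps` rows each. The row counts and the disjoint (non-overlapping) windowing below are illustrative choices.

```python
def to_windows(rows, timesteps):
    """Cut a flat list of feature rows into disjoint timesteps-long windows."""
    n = len(rows) // timesteps * timesteps      # drop the ragged tail
    return [rows[i:i + timesteps] for i in range(0, n, timesteps)]

rows = [[float(i)] * 29 for i in range(20)]     # 20 rows x 29 features
batches = to_windows(rows, timesteps=8)
print(len(batches), len(batches[0]), len(batches[0][0]))  # 2 8 29
```

Feeding a 2-D array straight in is exactly what produces "expected 3 dimensions, but got array with shape (339732, 29)".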