Questions tagged [recurrent-neural-network]
A recurrent neural network (RNN) is a class of artificial neural networks in which connections between units form a directed cycle.
3,356 questions
Why do we "pack" the sequences in PyTorch? (172 votes, 5 answers)
I was trying to replicate How to use packing for variable-length sequence inputs for rnn, but I think I first need to understand why we "pack" the sequences at all.
I understand why we "pad" them, but why is "packing" (via pack_padded_sequence)…

asked by aerin
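The idea behind packing can be shown without PyTorch at all. The sketch below is a pure-Python illustration of what `pack_padded_sequence` does conceptually, not PyTorch's actual implementation: sequences are sorted by length and the batch shrinks at each timestep, so the RNN never wastes computation on padding.

```python
# Pure-Python sketch of "packing": instead of feeding padded timesteps,
# sort sequences by length and let the batch shrink over time, so no
# padding values are ever processed. Names here are illustrative.

def pack_sequences(sequences):
    """Return (flat_data, batch_sizes), mimicking a PackedSequence."""
    seqs = sorted(sequences, key=len, reverse=True)  # longest first
    max_len = len(seqs[0])
    flat, batch_sizes = [], []
    for t in range(max_len):
        # only sequences still "alive" at timestep t contribute
        alive = [s[t] for s in seqs if len(s) > t]
        flat.extend(alive)
        batch_sizes.append(len(alive))
    return flat, batch_sizes

data, batch_sizes = pack_sequences([[1, 2, 3], [4, 5], [6]])
# timestep 0 has 3 live sequences, timestep 1 has 2, timestep 2 has 1
```

The `batch_sizes` list is exactly what lets the RNN process a full timestep across the batch and then drop finished sequences, rather than computing over padding and masking it afterwards.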
Many to one and many to many LSTM examples in Keras (159 votes, 2 answers)
I am trying to understand LSTMs and how to build them with Keras. I found that there are principally four modes for running an RNN (the four rightmost ones in the picture).
Image source: Andrej Karpathy
Now I wonder what a minimalistic code snippet for each of…

asked by Luca Thiede
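The distinction between the modes can be made concrete with a toy recurrence (a running sum, not a trained network). In Keras terms, many-to-many corresponds to `return_sequences=True` and many-to-one to `return_sequences=False`; the loop below is purely illustrative.

```python
# A toy "RNN" showing the difference between output modes:
# many-to-many returns an output per timestep, many-to-one only the last.

def rnn(inputs, step):
    h = 0
    outputs = []
    for x in inputs:
        h = step(h, x)      # new hidden state from old state + input
        outputs.append(h)
    return outputs

step = lambda h, x: h + x   # toy recurrence: running sum

seq = [1, 2, 3, 4]
many_to_many = rnn(seq, step)        # one output per timestep
many_to_one = rnn(seq, step)[-1]     # only the final timestep's output
```

Internally the recurrence is identical in both modes; the only difference is whether intermediate hidden states are exposed as outputs.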
What's the difference between "hidden" and "output" in PyTorch LSTM? (109 votes, 5 answers)
I'm having trouble understanding the documentation for PyTorch's LSTM module (and also RNN and GRU, which are similar). Regarding the outputs, it says:
Outputs: output, (h_n, c_n)
output (seq_len, batch, hidden_size * num_directions): tensor…

asked by N. Virgo
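The relationship the documentation describes can be shown with a toy recurrence (not a real LSTM): `output` stacks the top layer's hidden state for every timestep, while `h_n` is only the final timestep's state, so for a single-layer unidirectional network the last row of `output` equals `h_n`.

```python
# Toy recurrence illustrating the documented shapes: "output" records the
# hidden state at every step; "h_n" is the state after the last step only.

def run_rnn(inputs):
    h = 0.0
    output = []
    for x in inputs:
        h = 0.5 * h + x       # toy state update
        output.append(h)      # recorded every step -> "output"
    return output, h          # final state -> "h_n"

output, h_n = run_rnn([1.0, 2.0, 3.0])
assert output[-1] == h_n      # the key relationship from the docs
```

In a real multi-layer or bidirectional LSTM, `h_n` additionally contains the final state of every layer and direction, whereas `output` only exposes the top layer.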
What is the intuition of using tanh in LSTM? (102 votes, 2 answers)
In an LSTM network (Understanding LSTMs), why do the input gate and output gate use tanh?
What is the intuition behind this?
Is it just a nonlinear transformation? If so, can I change both to another activation function (e.g., ReLU)?

asked by DNK
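Part of the intuition is boundedness: tanh squashes values into (-1, 1), so a state that is transformed repeatedly over many timesteps cannot blow up, while an unbounded activation like ReLU can grow without limit. A small pure-Python illustration:

```python
# Apply each activation 50 times to a recurrent toy state: tanh keeps the
# value bounded in (-1, 1); ReLU on a growing pre-activation explodes.
import math

relu = lambda v: max(0.0, v)

h_tanh, h_relu = 1.0, 1.0
for _ in range(50):                   # 50 recurrent applications
    h_tanh = math.tanh(2.0 * h_tanh)  # stays inside (-1, 1)
    h_relu = relu(2.0 * h_relu)       # doubles every step

assert abs(h_tanh) < 1.0
assert h_relu == 2.0 ** 50            # exploded
```

This is only one side of the story (tanh being zero-centered also matters), but it shows why simply swapping in ReLU for the state activations risks unstable dynamics.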
What's the difference between a bidirectional LSTM and an LSTM? (98 votes, 5 answers)
Can someone please explain this? I know bidirectional LSTMs have a forward and a backward pass, but what is the advantage of this over a unidirectional LSTM?
What is each of them better suited for?

asked by shekit
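What "bidirectional" adds can be sketched with a toy recurrence (a running sum, not a real LSTM): a second pass runs over the sequence in reverse, so the output at each timestep combines both past and future context.

```python
# A unidirectional pass sees only the past; the bidirectional wrapper adds
# a reversed pass and pairs the two per timestep (real LSTMs concatenate
# the two hidden-state vectors instead of pairing scalars).

def forward_pass(seq):
    h, out = 0, []
    for x in seq:          # left to right: accumulates past context
        h += x
        out.append(h)
    return out

def bidirectional(seq):
    fwd = forward_pass(seq)
    bwd = forward_pass(seq[::-1])[::-1]   # right to left, re-aligned
    return list(zip(fwd, bwd))            # combined per timestep

result = bidirectional([1, 2, 3])
```

This is why bidirectional models help on tasks where the full sequence is available at once (tagging, classification) but cannot be used for causal, step-by-step prediction.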
What's the difference between convolutional and recurrent neural networks? (67 votes, 8 answers)
I'm new to the topic of neural networks. I came across the two terms convolutional neural network and recurrent neural network.
I'm wondering whether these two terms refer to the same thing or, if not, what the difference between them is.

asked by Tal_
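The structural difference can be shown in miniature: a convolution applies the same small kernel at every position with no memory between positions, while a recurrence feeds each step's result into the next, accumulating history. Both functions below are toy illustrations, not real layers.

```python
# Convolution: local, position-independent pattern matching.
# Recurrence: a state that carries information forward through time.

def conv1d(xs, kernel):
    k = len(kernel)
    return [sum(w * x for w, x in zip(kernel, xs[i:i + k]))
            for i in range(len(xs) - k + 1)]

def recurrent(xs, step):
    h, out = 0, []
    for x in xs:
        h = step(h, x)     # state depends on the whole prefix
        out.append(h)
    return out

xs = [1, 2, 3, 4]
conv_out = conv1d(xs, [1, 1])                # each output sees 2 inputs
rnn_out = recurrent(xs, lambda h, x: h + x)  # each output sees all prior
```

Each convolution output depends only on a fixed-size window; each recurrent output depends on everything before it, which is why CNNs suit spatial data and RNNs suit sequential data.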
How to use return_sequences option and TimeDistributed layer in Keras? (64 votes, 2 answers)
I have a dialog corpus like the one below, and I want to implement an LSTM model that predicts a system action. The system action is described as a bit vector, and a user input is encoded as a word embedding, which is also a bit vector.
t1: user: "Do you…

asked by jef
How do I create a variable-length input LSTM in Keras? (62 votes, 4 answers)
I am trying to do some vanilla pattern recognition with an LSTM using Keras to predict the next element in a sequence.
My data look like this:
where the label of the training sequence is the last element in the list:…

asked by erip
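One common way to handle variable-length input, sketched in pure Python: pad every sequence to the longest length and carry a mask marking which positions are real, so downstream computation can ignore the padded steps. (Keras does this with `Masking` or `mask_zero=True` on an `Embedding`; the helper below is illustrative.)

```python
# Pad a batch of variable-length sequences and build the matching mask.

def pad_batch(sequences, pad_value=0):
    max_len = max(len(s) for s in sequences)
    padded = [s + [pad_value] * (max_len - len(s)) for s in sequences]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in sequences]
    return padded, mask

padded, mask = pad_batch([[5, 6, 7], [8]])
# padded rows all have length 3; mask marks real (1) vs padded (0) steps
```

The alternative to padding is training with batch size 1 (no padding needed), which is slower but avoids masking entirely.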
What is num_units in tensorflow BasicLSTMCell? (59 votes, 11 answers)
In the MNIST LSTM examples, I don't understand what "hidden layer" means. Is it the imaginary layer formed when you represent an unrolled RNN over time?
Why is num_units = 128 in most cases?

asked by Subrat
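`num_units` is simply the dimensionality of the LSTM's hidden state (and cell state): a model-capacity choice, not something dictated by the data, and 128 is just a common default. The parameter count follows directly, since each of the four gates has a weight matrix over the concatenated `[input; hidden]` vector plus a bias:

```python
# Standard LSTM parameter count: 4 gates, each with weights over the
# concatenated input+hidden vector and a bias vector of size num_units.

def lstm_param_count(input_dim, num_units):
    return 4 * ((input_dim + num_units) * num_units + num_units)

# e.g. treating MNIST rows as timesteps: 28 input features, 128 units
params = lstm_param_count(28, 128)
```

Doubling `num_units` roughly quadruples the recurrent weights, which is why the hidden size dominates model size for long feature vectors.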
No module named 'tqdm' (58 votes, 6 answers)
I am running the following pixel recurrent neural network (RNN) code using Python 3.6
import os
import logging
import numpy as np
from tqdm import trange
import tensorflow as tf
from utils import *
from network import Network
from statistic import…

asked by A. Syam
Pytorch - RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed (54 votes, 2 answers)
I keep running into this error:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
I have searched the PyTorch forum, but still…

asked by Viet Phan
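A minimal reproduction and fix, assuming PyTorch is installed: the error appears when `backward()` runs twice through the same graph, because the intermediate buffers are freed after the first call by default.

```python
# Calling backward() a second time on the same graph raises the error
# unless the first call passed retain_graph=True.
import torch

x = torch.tensor([1.0], requires_grad=True)
y = (x * 2).sum()

y.backward(retain_graph=True)  # keep buffers so a second backward is legal
y.backward()                   # without the flag above, this would raise

# gradients accumulate across calls: d(2x)/dx = 2, computed twice
assert x.grad.item() == 4.0
```

Note that in training loops this error often signals a different bug, such as reusing a tensor computed outside the loop, and the right fix there is to recompute the graph each iteration rather than to retain it.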
Shuffling training data with LSTM RNN (41 votes, 1 answer)
Since an LSTM RNN uses previous events to predict current sequences, why do we shuffle the training data? Don't we lose the temporal ordering of the training data? How is it still effective at making predictions after being trained on shuffled…

asked by hellowill89
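The resolution is that what usually gets shuffled is the set of *windows* (training samples), not the timesteps inside them: temporal order is preserved within each example, and shuffling only decorrelates consecutive gradient updates. A pure-Python sketch:

```python
# Build overlapping (window, next_value) pairs, then shuffle the pairs.
# Order *between* samples is randomized; order *inside* each sample is not.
import random

series = list(range(10))
window = 3
samples = [(series[i:i + window], series[i + window])
           for i in range(len(series) - window)]

random.seed(0)
random.shuffle(samples)

# inside every sample the timesteps are still consecutive
assert all(x == xs[0] + k for xs, _ in samples for k, x in enumerate(xs))
assert all(y == xs[-1] + 1 for xs, y in samples)
```

The exception is stateful training, where state is carried between batches; there, batch order does matter and shuffling must be disabled.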
ValueError: Input 0 is incompatible with layer lstm_13: expected ndim=3, found ndim=4 (40 votes, 4 answers)
I am attempting multi-class classification; here are the details of my training input and output:
train_input.shape = (1, 95000, 360) (a 95000-length input array, each element being an array of length 360)
train_output.shape = (1, 95000, 22)…

asked by Urja Pawar
Understanding Keras LSTMs: Role of Batch-size and Statefulness (38 votes, 1 answer)
Sources
There are several sources explaining stateful/stateless LSTMs and the role of batch_size, which I have already read. I'll refer to them later in my post:
[1]…

asked by ascripter
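The stateful/stateless distinction fits in a few lines with a toy recurrence (a running sum, not a real LSTM): a stateless layer resets its hidden state at the start of every batch, while `stateful=True` carries the final state of batch i into batch i+1, which is why batch ordering then matters.

```python
# Stateless: fresh state per batch. Stateful: state flows across batches.

def run_batches(batches, stateful):
    h, finals = 0, []
    for batch in batches:
        if not stateful:
            h = 0              # stateless: reset at each batch boundary
        for x in batch:
            h += x             # toy recurrence
        finals.append(h)
    return finals

batches = [[1, 2], [3, 4]]
stateless = run_batches(batches, stateful=False)  # second batch starts fresh
stateful = run_batches(batches, stateful=True)    # second batch continues
```

This also explains the batch_size constraint in stateful Keras models: sample i of batch n+1 continues sample i of batch n, so the batch layout must be fixed and consistent.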
Error when checking model input: expected lstm_1_input to have 3 dimensions, but got array with shape (339732, 29) (36 votes, 3 answers)
My input is simply a CSV file with 339732 rows and two columns:
the first being 29 feature values, i.e. X
the second being a binary label value, i.e. Y
I am trying to train my data on a stacked LSTM model:
data_dim = 29
timesteps = 8
num_classes…

asked by Saurav--
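Here the model expects (samples, timesteps, features) but the CSV yields a 2-D (339732, 29) array. With `timesteps = 8` from the question, building overlapping windows is one standard fix; the sketch below uses `sliding_window_view`, which requires NumPy >= 1.20 and returns a zero-copy view.

```python
# Turn a 2-D (rows, features) array into 3-D (samples, timesteps, features)
# using overlapping windows of 8 consecutive rows.
import numpy as np

X = np.zeros((339732, 29), dtype=np.float32)
timesteps = 8

windows = np.lib.stride_tricks.sliding_window_view(X, timesteps, axis=0)
windows = windows.transpose(0, 2, 1)   # -> (339725, 8, 29)
assert windows.shape == (339732 - timesteps + 1, timesteps, 29)
```

The label array must then be aligned so that each window's target is the label of its final row (dropping the first `timesteps - 1` labels).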