Questions tagged [lstm]

Long short-term memory (LSTM): a neural network (NN) architecture built from recurrent NN blocks that can remember a value for an arbitrary length of time. It is a very popular building block for deep NNs.

Long short-term memory neural networks (LSTMs) are a type of recurrent neural network. They can take time-series data and make predictions using knowledge of how the system is evolving.

A major benefit of LSTMs is their ability to store and use long-term information, not just the input they are given at a particular time step. For more information on LSTMs, see colah's blog post and MachineLearningMastery.
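To make the idea of "storing long-term information" concrete, here is a minimal sketch of a single scalar LSTM cell using the standard gate equations. The parameter names and the hand-picked values are assumptions for illustration only (nothing is trained); they push the forget gate close to 1 and the input gate close to 0, so the cell state barely changes over 100 time steps.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell (standard gate equations)."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # new cell state: keep old memory, add new input
    h = o * math.tanh(c)     # new hidden state (the cell's output)
    return h, c


# Hypothetical, hand-picked parameters (not trained):
# a large forget bias keeps the memory, a very negative input bias blocks writes.
params = {
    "wf": 0.0, "uf": 0.0, "bf": 10.0,
    "wi": 0.0, "ui": 0.0, "bi": -10.0,
    "wo": 0.0, "uo": 0.0, "bo": 0.0,
    "wg": 1.0, "ug": 0.0, "bg": 0.0,
}

h, c = 0.0, 0.75                       # start with a value stored in the cell
for x in [0.3, -0.9, 0.5, 0.1] * 25:   # 100 time steps of unrelated input
    h, c = lstm_step(x, h, c, params)

print(round(c, 3))  # stays close to the initial 0.75
```

In a trained network the gates are learned, so the cell decides per time step how much to keep, overwrite, and expose.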

6289 questions
2 votes, 1 answer

Tensorflow: Can't overfit training data with batch size > 1

I coded a small RNN with TensorFlow to return the total energy consumption given some parameters. There seems to be a problem in my code: it can't overfit the training data when I use a batch size > 1 (even with only 4 samples!). In the code…
2 votes, 0 answers

How to use a Keras LSTM with batch size of 1? (avoid zero padding)

I want to train a Keras LSTM using a batch size of one. In this case I wouldn't need zero padding, as far as I understand. The necessity for zero padding comes from equalizing the size of the batches, right? It turns out this is not as easy as I…
toobee
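The padding question above can be illustrated without any framework. This is a minimal sketch in plain Python, not the asker's code and not a Keras API: `pad_batch` and `batches` are hypothetical helpers that pad each batch to the length of its longest sequence. With a batch size of 1, every batch contains a single sequence, so no padding values are ever added.

```python
def pad_batch(sequences, pad_value=0.0):
    """Right-pad every sequence to the length of the longest one in the batch."""
    max_len = max(len(s) for s in sequences)
    return [list(s) + [pad_value] * (max_len - len(s)) for s in sequences]


def batches(sequences, batch_size):
    """Yield padded batches of at most batch_size sequences."""
    for start in range(0, len(sequences), batch_size):
        yield pad_batch(sequences[start:start + batch_size])


# Three variable-length sequences (made-up values)
data = [[0.1, 0.2, 0.3], [0.4], [0.5, 0.6]]

padded = list(batches(data, batch_size=3))    # one batch, zeros added
unpadded = list(batches(data, batch_size=1))  # three batches, nothing added

print(padded)
print(unpadded)
```

Whether a given framework accepts batches whose sequence length differs from one batch to the next depends on how the model's input shape is declared, which this sketch does not cover.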
2 votes, 1 answer

LSTM network begins generating junk after a few iterations

I have a large text corpus of about 7M characters that I am training an LSTM network on. However, I consistently see that after about the 5th epoch the generated sentences, instead of improving, become complete junk. I have pasted an…
shekit
2 votes, 1 answer

Implementing a Generative RNN with continuous input and discrete output

I am currently using a generative RNN to classify indices in a sequence (sort of saying whether something is noise or not noise). My input is continuous (i.e. a real value between 0 and 1) and my output is either 0 or 1. For example, if the…
2 votes, 1 answer

How to prepare time series data for multi step and multi variable in LSTM Keras

Firstly, I am new to Keras. I have the following case: time series data with 15 features, held in a pandas DataFrame. The time series data is hourly, so I want to predict the next 16 hourly values. I want to give input (16 time series data) ,…
yunus kula
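For the multi-step, multi-variable case described above (16 hourly steps in, 16 hourly steps out, 15 features), a common preparation step is to cut the series into sliding windows. The sketch below is one possible approach in plain Python, not the asker's code; `make_windows` is a hypothetical helper and the data is a synthetic stand-in for the DataFrame's rows.

```python
def make_windows(series, n_in=16, n_out=16):
    """Split a series into (input window, target window) pairs.

    series: list of time steps, each a list of feature values.
    Returns X with shape (samples, n_in, features)
    and y with shape (samples, n_out, features).
    """
    X, y = [], []
    for start in range(len(series) - n_in - n_out + 1):
        X.append(series[start:start + n_in])
        y.append(series[start + n_in:start + n_in + n_out])
    return X, y


# Synthetic stand-in data: 100 hourly rows with 15 features each
series = [[float(t * 15 + f) for f in range(15)] for t in range(100)]

X, y = make_windows(series)
print(len(X), len(X[0]), len(X[0][0]))  # samples, time steps, features
print(len(y), len(y[0]), len(y[0][0]))
```

The nested lists follow the (samples, time steps, features) layout that recurrent layers typically expect; converting them to arrays and selecting which columns to predict is left out here.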
2 votes, 1 answer

How to enforce rules like move legality in chess at the output of a neural network?

How do I apply rules, like chess rules, to a neural network, so the network doesn't predict/train invalid moves?
R.Schaefer
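One way to approach the move-legality question above is to mask the network's output so that illegal moves receive zero probability before a move is chosen. This is a minimal sketch of that masking idea, not an answer taken from the thread; `masked_softmax` is a hypothetical helper, and the logits and legality flags are made-up values standing in for a network output and a rules engine.

```python
import math


def masked_softmax(logits, legal):
    """Softmax over legal entries only; illegal entries get probability 0."""
    legal_logits = [l for l, ok in zip(logits, legal) if ok]
    if not legal_logits:
        raise ValueError("no legal moves available")
    m = max(legal_logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) if ok else 0.0 for l, ok in zip(logits, legal)]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up network output for 4 candidate moves; move 1 is illegal
logits = [2.0, 5.0, 1.0, 0.5]
legal = [True, False, True, True]

probs = masked_softmax(logits, legal)
best_move = max(range(len(probs)), key=lambda k: probs[k])
print(probs, best_move)
```

The mask comes from the game rules, not from the network, so the network never has to learn legality on its own. The same mask can be applied during training so that the loss is computed only over legal moves.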
2 votes, 1 answer

Tensorflow. ValueError: The two structures don't have the same number of elements

This is my current code for implementing an encoder LSTM using raw_rnn. This question is also related to another question I asked before (Tensorflow raw_rnn retrieve tensor of shape BATCH x DIM from embedding matrix). When I run the following code I get the…
Asterisk
2 votes, 1 answer

Stacking LSTM layers/cells in tensorflow

I am trying to stack LSTM cells in TF, this is what I have: for layer in xrange(args.num_layers): cell_fw = tf.contrib.rnn.LSTMCell(args.hidden_size, initializer=tf.orthogonal_initializer()) cell_bw =…
2 votes, 1 answer

How to speed up the training of an RNN model with multiple GPUs in TensorFlow?

For example, the RNN is a dynamic 3-layer bidirectional LSTM with a hidden vector size of 200 (tf.nn.bidirectional_dynamic_rnn) and I have 4 GPUs to train the model. I saw a post using data parallelism on subsets of samples in a batch but that…
2 votes, 1 answer

LSTM with one sequence feature and 3 current features

I have a question about using an LSTM model to predict sales. To build the model, I want to input the previous sales values (choosing a sample size of 30) and also the current (t) features, such as whether there is a promotion, whether it is a holiday, etc. My current…
Icy
2 votes, 2 answers

Tensorflow: how to obtain intermediate cell states (c) from LSTMCell using dynamic_rnn?

By default, function dynamic_rnn outputs only hidden states (known as m) for each time point which can be obtained as follows: cell = tf.contrib.rnn.LSTMCell(100) rnn_outputs, _ = tf.nn.dynamic_rnn(cell, …
2 votes, 1 answer

Using dropout with CudnnLSTM for training and validation

I am trying to use dropout with CudnnLSTM (tf.contrib.cudnn_rnn.python.layers.CudnnLSTM), and I would like to be able to build just one graph and set the dropout to some non-zero fractional value for training and then set the dropout to 0 for…
TFdoe
2 votes, 1 answer

Why return sequences in stacked RNNs?

When stacking RNNs in Keras, it is mandatory to set the return_sequences parameter to True. For instance: lstm1 = LSTM(1, return_sequences=True)(inputs1) lstm2 = LSTM(1)(lstm1) It is somewhat intuitive to preserve the dimensionality of input…
Buomsoo Kim
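The return_sequences question above comes down to shapes: a recurrent layer consumes a whole sequence, so the layer below it must emit one output per time step rather than only its final state. The sketch below uses a toy scalar RNN in plain Python (not an LSTM and not Keras) to show this; `simple_rnn` and its weights are hypothetical names and values chosen for illustration.

```python
import math


def simple_rnn(sequence, w_x, w_h, return_sequences=False):
    """Toy scalar RNN layer: takes a list of floats, one per time step."""
    h = 0.0
    outputs = []
    for x in sequence:
        h = math.tanh(w_x * x + w_h * h)  # update the hidden state
        outputs.append(h)
    # Either every hidden state (a sequence) or only the last one (a scalar)
    return outputs if return_sequences else h


inputs = [0.1, 0.4, -0.2, 0.7]

# The first layer must return the full sequence so the second has steps to iterate over
layer1_out = simple_rnn(inputs, w_x=0.5, w_h=0.3, return_sequences=True)
layer2_out = simple_rnn(layer1_out, w_x=0.8, w_h=0.2)

print(len(layer1_out), layer2_out)
```

If the first layer returned only its last state, the second layer would receive a single number instead of a sequence, which is the shape mismatch that setting return_sequences to True on the lower layer avoids.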
2 votes, 1 answer

Transforming mLSTM - Run it on multiple GPUs

I'm running an mLSTM (multiplicative LSTM) transform, based on the mLSTM by OpenAI (just the transform; it is already trained), but it takes a really long time to transform more than ~100,000 docs. I want it to run on multiple GPUs. I saw some examples…
Lior Magen
2 votes, 2 answers

LSTM error with date format

This is my first attempt at deep learning. The purpose of this code is to predict the FOREX market direction. Here is the code: import matplotlib.pyplot as plt import numpy as np import pandas as pd from sklearn.preprocessing import…
Sayed Gouda