I first used L.LSTM, then found NStepLSTM, which the official tutorial does not cover: https://docs.chainer.org/en/stable/reference/generated/chainer.links.NStepLSTM.html?highlight=Nstep

  1. Why do chainer.links.NStepLSTM and chainer.links.NStepBiLSTM not have reset_state? How do I reset the state?

  2. Does it take a list of sequences, where each sequence is one chainer.Variable (e.g. one article containing multiple words is one Variable)? Is the purpose of this class to handle variable-length sequences?

  3. Can we use truncated BPTT to save memory with chainer.links.NStepLSTM? If so, how?

machen

1 Answer

1. NStepLSTM takes a batch of sequences and returns a batch of output sequences, whereas LSTM takes a batch of words (one time step) per call. You don't need a for-loop over time steps with NStepLSTM. NStepLSTM uses cuDNN, a library provided by NVIDIA, and is very fast. NStepLSTM does not hold a state, so there is nothing to reset. If you want to chain NStepLSTMs, use the outputs that NStepLSTM returns. See the seq2seq example: https://github.com/chainer/chainer/blob/master/examples/seq2seq/seq2seq.py

2. Yes. It takes, for example, a batch of sequences of embedding vectors created from sentences. The sequences may have different lengths. See the seq2seq example. Note that L.NStepLSTM takes a list of sentences, whereas F.NStepLSTM takes transposed sequences, that is, a sequence of batches of words. In fact, L.NStepLSTM calls F.transpose_sequences and F.NStepLSTM in its implementation.

3. Sorry, that is difficult. As I said, NStepLSTM is a wrapper around cuDNN's RNN library, and it does not support truncated BPTT. You can, of course, split the sentences and call NStepLSTM twice.
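
To make the "transposed sequences" remark in point 2 concrete, here is a minimal pure-Python sketch of what transposing a batch of variable-length sequences means. It does not call Chainer: `transpose_sequences_sketch` is a hypothetical helper written only for illustration, and plain lists of word strings stand in for the embedded `chainer.Variable`s. It assumes the sequences are sorted from longest to shortest, so that the batch at each time step is a prefix of the full batch.

```python
def transpose_sequences_sketch(seqs):
    """Turn a list of sequences into a list of per-time-step batches.

    Hypothetical illustration only, not Chainer's implementation.
    Assumes `seqs` is sorted by length, longest first.
    """
    lengths = [len(s) for s in seqs]
    if lengths != sorted(lengths, reverse=True):
        raise ValueError("sequences must be sorted longest first")
    max_len = lengths[0] if lengths else 0
    steps = []
    for t in range(max_len):
        # Only sequences that still have a word at step t join this batch.
        steps.append([s[t] for s in seqs if len(s) > t])
    return steps


sentences = [
    ["the", "cat", "sat"],
    ["hello", "world"],
    ["hi"],
]
batches = transpose_sequences_sketch(sentences)
print(batches)
# [['the', 'hello', 'hi'], ['cat', 'world'], ['sat']]
```

The batch shrinks from three words to one as the shorter sentences run out, which is how different lengths can be handled without padding.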
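
The "split the sentences and call NStepLSTM twice" idea in point 3 could be sketched roughly as below. This pure-Python example shows only the splitting step: `split_into_windows` is a hypothetical helper, and integers stand in for word vectors. In a real model each window would be passed to NStepLSTM in turn, starting from the states returned by the previous call and cutting the graph between calls (for instance with `Variable.unchain_backward()`). That wiring is an assumption and appears here only as a comment.

```python
def split_into_windows(seqs, window):
    """Split each sequence into consecutive chunks of at most `window` steps.

    Returns a list of windows. Each window is a list of (seq_index, chunk)
    pairs for the sequences that still have data at that point, so the
    caller can tell which rows of the carried-over state to keep.
    Hypothetical helper for illustration, not part of Chainer.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    longest = max((len(s) for s in seqs), default=0)
    windows = []
    for start in range(0, longest, window):
        active = [
            (i, s[start:start + window])
            for i, s in enumerate(seqs)
            if len(s) > start
        ]
        windows.append(active)
    return windows


seqs = [[1, 2, 3, 4, 5], [6, 7, 8], [9]]
for k, win in enumerate(split_into_windows(seqs, window=2)):
    # In a real model: run NStepLSTM on the chunks of this window, starting
    # from the states returned for the previous window, then truncate the
    # graph before moving on (assumed wiring, not shown here).
    print(k, win)
# 0 [(0, [1, 2]), (1, [6, 7]), (2, [9])]
# 1 [(0, [3, 4]), (1, [8])]
# 2 [(0, [5])]
```

Shorter sequences drop out of later windows, so the state carried between calls has to be sliced to the sequences that are still active. That bookkeeping is a large part of why this is awkward with NStepLSTM.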

  • You mean I don't need to write a BPTT updater when using NStepLSTM? I have to implement a case where the output of NStepLSTM at every time step is concatenated and then fed as input to the next layer (the layer after NStepLSTM uses all time steps of NStepLSTM). In this case, how do I write the BPTT updater? Or do you mean I don't need to write one explicitly? – machen Oct 02 '17 at 08:32
  • Regarding the input of NStepLSTM, which is a list of Variables: do the Variables in this list need to be temporally correlated with each other? – machen Mar 01 '18 at 08:17
  • If I want to use only the last time step to calculate the loss, given that NStepLSTM returns 3 variables, should I use the last one or the first one? – machen Mar 04 '18 at 09:47