The Chainer RNN tutorial appears to have incorrect code on this page: https://docs.chainer.org/en/stable/tutorial/recurrentnet.html

def update_bptt(updater):
    loss = 0
    for i in range(35):
        batch = train_iter.__next__()
        x, t = chainer.dataset.concat_examples(batch)
        loss += model(chainer.Variable(x), chainer.Variable(t))

    model.cleargrads()
    loss.backward()
    loss.unchain_backward()  # truncate
    optimizer.update()

updater = training.StandardUpdater(train_iter, optimizer, update_bptt)

But the third parameter of training.StandardUpdater is converter (chainer.dataset.concat_examples by default), not an update function. How do I write BPTT with the trainer correctly?

  • I'm also not sure about the documentation above. But you can refer to the official Chainer example; it has a sample implementation of BPTTUpdater. https://github.com/chainer/chainer/blob/master/examples/ptb/train_ptb.py – corochann Aug 13 '17 at 23:49
  • Since chainer.training.ParallelUpdater can train on multiple GPUs, how can this ability be added to a BPTT updater? This example doesn't show that. – machen Aug 14 '17 at 06:18
  • @corochann How do I write a BPTT updater using multiple GPUs? I can't find an example, because the existing method only extends training.StandardUpdater and thus uses only one GPU. – machen Aug 21 '17 at 01:59

0 Answers