MXNet Gluon custom data iterator

Question

I have written a custom data iterator using the mx.io.DataIter class. What's the easiest way to use this data iterator with the Gluon interface?

I went through the documentation and couldn't find an easy way to do so. One idea was to use it as a plain iterator and get the data and label from each batch as follows.

for e in range(epochs):
    train_iter.reset()
    for batch_data in train_iter:
        data = nd.concatenate(([d for d in batch_data.data]))
        label = nd.concatenate(([l for l in batch_data.label]))
        with autograd.record():
            output = net(data)
            loss = softmax_cross_entropy(output, label)
        loss.backward()
        trainer.step(batch_size)
        print(nd.mean(loss).asscalar())

But this may not be optimal as I need to concatenate per batch.

What's the optimal way to achieve this? That is, is there a systematic way to write a simple custom iterator for Gluon?
How do I add context information in the above cases?

Did you ever find out how to add context info? – anana Jul 03 '18 at 16:18

score 1 · Answer 1 · answered Nov 17 '17 at 19:35

I think your approach works. Basically you can get data from batch_data.data and label from batch_data.label and feed them into the network.

I'm not sure why you need to concat the data and labels - maybe it's to do with your network definition.
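If every batch carries exactly one data array and one label array, the concatenation can be skipped by indexing into the lists directly. Below is a minimal sketch under that assumption. The helper name iterate_batches is hypothetical and not part of MXNet, and the optional ctx argument relies on NDArray.as_in_context to move each array to the requested context.

```python
def iterate_batches(data_iter, ctx=None):
    """Yield (data, label) pairs from an mx.io.DataIter-style iterator.

    Assumes batch.data and batch.label are single-element lists, which is
    the common case for one input and one label per sample.
    """
    data_iter.reset()  # rewind so every epoch starts from the first batch
    for batch in data_iter:
        data, label = batch.data[0], batch.label[0]
        if ctx is not None:
            # as_in_context returns the array on ctx, copying only if needed
            data = data.as_in_context(ctx)
            label = label.as_in_context(ctx)
        yield data, label
```

In the training loop from the question, the inner loop would then read "for data, label in iterate_batches(train_iter, ctx):" and the two nd.concatenate lines go away. The network parameters would also need to live on the same context, for example by passing ctx to net.initialize.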

If you ever need to split the data and train on multiple GPUs, you can use the gluon.utils.split_and_load function to do that.

answered Nov 17 '17 at 19:35

eric-haibin-lin


Thanks Eric. I just wish there were a much cleaner way to implement batch data. Not having a Gluon-compatible API seems very strange and surprising! – krishnakamathk Nov 22 '17 at 21:33
