
I am learning the MXNet framework and trying to run the SSD object detection example: https://gluon.mxnet.io/chapter08_computer-vision/object-detection.html

I train on an NVIDIA GTX 1050 GPU with 4 GB of memory, working in a Jupyter notebook. Versions: Python 3.6, MXNet 1.3.1.

The tutorial says that training from scratch takes about 30 minutes on one GPU. I stopped after 3 hours. By the time I interrupted training, the model had processed 24459 batches of 32 images each. The whole dataset is only 87.7 MB, which is far less than 24459*32*256*256 (each image is 256x256), so the iterator must have gone over the data many times. I can't understand why it takes so long. Is there something particular about image.ImageDetIter (for example, that it never stops by itself)?
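For reference, here is the arithmetic behind that comparison as a rough sketch. It uses the same formula as above, so it assumes one byte per pixel and ignores colour channels and the compression of the file on disk; it only shows the order of magnitude.

```python
# Figures taken from the training run described above.
batches = 24459
batch_size = 32
side = 256                 # images are 256x256
dataset_bytes = 87.7e6     # size of the dataset on disk

images_seen = batches * batch_size
pixels_seen = images_seen * side * side   # assumption: 1 byte per pixel

print(images_seen)                   # number of images the model was fed
print(pixels_seen / 1e9)             # raw volume in GB
print(pixels_seen / dataset_bytes)   # how many times larger than the dataset
```

The raw volume is hundreds of times larger than the dataset, which is why I suspect the iterator is not stopping at the end of an epoch.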

stansf

1 Answer


Thanks for including the version info. You're absolutely correct: there was a bug in MXNet 1.3.x where ImageDetIter looped indefinitely on the example you were running. This was fixed in December 2018, and if you upgrade to MXNet 1.4.0 you won't see the issue. I confirmed this by running the repro below.
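A quick way to tell whether an installed version predates the fix is to compare its version string against 1.4.0. This is only a minimal sketch: `parse_version` and `has_imagedetiter_fix` are hypothetical helper names, and the 1.4.0 threshold comes from my statement above rather than from a changelog. In a real session you would pass `mxnet.__version__` to it.

```python
import re

FIXED_IN = (1, 4, 0)  # assumption: first release containing the fix

def parse_version(version):
    """Turn '1.3.1' (or '1.4.0.post0') into a 3-tuple of integers."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '1.4'
    return tuple(parts[:3])

def has_imagedetiter_fix(version):
    """True if the given version is at or after FIXED_IN."""
    return parse_version(version) >= FIXED_IN

print(has_imagedetiter_fix('1.3.1'))
print(has_imagedetiter_fix('1.4.0'))
```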

Another important note: "Deep Learning - The Straight Dope" has been deprecated in favor of Dive into Deep Learning (d2l.ai). The content is updated and is being used for a course on MXNet. Here's the corresponding chapter in the book.

Additionally, videos from the course are posted here, if you want to watch them.

As for the repro, I ran the snippet below and confirmed that it loops indefinitely on 1.3.x and terminates on 1.4.0.

from mxnet import image

data_shape = 256  # image side length, as in the question (256x256)

train_iter = image.ImageDetIter(
    batch_size=1000,
    data_shape=(3, data_shape, data_shape),
    path_imgrec='./data/pikachu_train.rec',
    path_imgidx='./data/pikachu_train.idx',
    # shuffle=True,
    # mean=True,
    # rand_crop=1,
    min_object_covered=0.95,
    last_batch_handle='pad',
    max_attempts=5)
train_iter.reset()
# Count batches in a single pass over the training set.
for i, data in enumerate(train_iter):
    print(i + 1)  # goes forever on 1.3.0 but not 1.4.0
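If you can't upgrade right away, one way to avoid a silent multi-hour loop is to cap the number of batches you accept from the iterator in a single pass. The sketch below is plain Python and not part of MXNet: `count_batches` and its `slack` parameter are hypothetical names, and `num_samples` is the number of images in your dataset, which you have to supply yourself.

```python
import itertools
import math

def count_batches(data_iter, num_samples, batch_size, slack=2):
    """Count batches in one pass, aborting if the iterator overruns.

    Raises RuntimeError once more than slack * expected batches are seen.
    """
    expected = math.ceil(num_samples / batch_size)
    limit = expected * slack
    count = 0
    for _ in data_iter:
        count += 1
        if count > limit:
            raise RuntimeError(
                'iterator yielded more than %d batches (expected about %d); '
                'it may not terminate' % (limit, expected))
    return count

# A well-behaved iterator: 4 batches for 100 samples at batch size 32.
print(count_batches(iter(range(4)), num_samples=100, batch_size=32))

# A never-ending iterator is caught instead of running forever.
try:
    count_batches(itertools.count(), num_samples=100, batch_size=32)
except RuntimeError as err:
    print('stopped:', err)
```

The same guard could wrap `train_iter` inside a training loop, so that a runaway epoch fails fast instead of consuming hours.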

Hope that helps,

Vishaal
