I'm new to Gluon, and I decided to run the examples to get familiar with the coding style (I used Keras a couple of years ago, and this hybrid style is a little confusing to me).
My problem is that I can run the examples, but after successfully executing every cell in this example (it's a Jupyter notebook), I upload an external image and the net seems incapable of detecting any object. I pasted the same cell into 02. Predict with pre-trained Faster RCNN models, and the pre-trained net had no problem detecting every person in the image, so it seems to me that the model in the example is not being trained correctly.
Has this happened to anyone else?
Am I missing something?
Thank you in advance!
(By the way, I have tried uncommenting the 32nd line of the training loop (the one with autograd.backward) and changing the break-if limit in the same loop, with no luck.)
LINKS
I'm having this trouble while executing the original examples plus the cell below.
02) https://gluon-cv.mxnet.io/build/examples_detection/demo_faster_rcnn.html
06) https://gluon-cv.mxnet.io/build/examples_detection/train_faster_rcnn_voc.html
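For reference, the working cell from 02 boils down to something like this (a minimal sketch; faster_rcnn_resnet50_v1b_voc is the pre-trained model that demo loads, and pretrained_net is just my name for it):

from gluoncv import model_zoo, data, utils
from matplotlib import pyplot as plt

# fully pre-trained detector, as used in demo 02
pretrained_net = model_zoo.get_model('faster_rcnn_resnet50_v1b_voc', pretrained=True)

# same preprocessing + inference + visualization as my cell below
x, img = data.transforms.presets.rcnn.load_test('unnamed.jpg')
box_ids, scores, bboxes = pretrained_net(x)
utils.viz.plot_bbox(img, bboxes[0], scores[0], box_ids[0], class_names=pretrained_net.classes)
plt.show()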
My test image
Cell to detect objects in the image
from gluoncv import data, utils
from gluoncv.data.transforms import presets
from matplotlib import pyplot as plt

short, max_size = 600, 800
# note: this train transform is never actually applied; load_test does its own resizing
RCNN_transform = presets.rcnn.FasterRCNNDefaultTrainTransform(short, max_size)

myImg = 'unnamed.jpg'
x, img = data.transforms.presets.rcnn.load_test(myImg)  # batchified tensor + original image
box_ids, scores, bboxes = net(x)  # net is the model trained in example 06
ax = utils.viz.plot_bbox(img, bboxes[0], scores[0], box_ids[0], class_names=net.classes)
plt.show()
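One detail that might matter: utils.viz.plot_bbox keeps only boxes with scores above thresh=0.5 by default, so a barely-trained net could be producing detections that are simply filtered out. Re-plotting with a lower threshold shows whether it predicts anything at all:

# same call as above, but with a lower score threshold (0.1 is an arbitrary choice)
ax = utils.viz.plot_bbox(img, bboxes[0], scores[0], box_ids[0],
                         class_names=net.classes, thresh=0.1)
plt.show()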
system info (if relevant)
I'm running this both on my personal computer and on Google Colab and get the same results, but just in case:
OS: Ubuntu 18.04
hardware
$ hwinfo --short
cpu:
Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz, 2700 MHz  (same line repeated ×8, one per logical core)
graphics card:
nVidia GM107M [GeForce GTX 960M]
Intel HD Graphics 530
NVidia driver
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82 Driver Version: 440.82 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 960M Off | 00000000:02:00.0 Off | N/A |
| N/A 41C P5 N/A / N/A | 665MiB / 4046MiB | 23% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 2560 G /usr/lib/xorg/Xorg 308MiB |
| 0 2921 G /usr/bin/gnome-shell 132MiB |
| 0 3741 G ...quest-channel-token=7390050445218241480 31MiB |
| 0 5455 G ...AAAAAAAAAAAACAAAAAAAAAA= --shared-files 176MiB |
+-----------------------------------------------------------------------------+
CUDA
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
MXNet and GluonCV installed via
$ pip install mxnet-cu102mkl
$ pip install --upgrade mxnet-cu102mkl gluoncv
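To confirm which versions actually ended up installed:

import mxnet as mx
import gluoncv
print('mxnet', mx.__version__, '| gluoncv', gluoncv.__version__)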
EDIT: I have been making modifications to the training loop; this is what I have so far. The first block of lines right after the third for loop just moves the data onto the GPU.
import mxnet as mx
from mxnet import autograd

# net.hybridize()
epochs = 50
for epoch in range(epochs):
    print("epoch: ", epoch, "---------------------------------")
    batch_size = 10
    for ib, batch in enumerate(train_loader):
        # print(ib)
        if ib > 500:
            break
        for dataa, label, rpn_cls_targets, rpn_box_targets, rpn_box_masks in zip(*batch):
            # move everything onto the GPU
            dataa = dataa.as_in_context(mx.gpu(0))
            label = label.as_in_context(mx.gpu(0)).expand_dims(0)
            rpn_cls_targets = rpn_cls_targets.as_in_context(mx.gpu(0))
            rpn_box_targets = rpn_box_targets.as_in_context(mx.gpu(0))
            rpn_box_masks = rpn_box_masks.as_in_context(mx.gpu(0))
            gt_label = label[:, :, 4:5]
            gt_box = label[:, :, :4]
            with autograd.record():
                # network forward
                cls_preds, box_preds, roi, samples, matches, rpn_score, rpn_box, anchors, cls_targets, box_targets, box_masks, _ = net(dataa.expand_dims(0), gt_box, gt_label)
                # losses of rpn
                rpn_score = rpn_score.squeeze(axis=-1)
                num_rpn_pos = (rpn_cls_targets >= 0).sum()
                rpn_loss1 = rpn_cls_loss(rpn_score, rpn_cls_targets, rpn_cls_targets >= 0) * rpn_cls_targets.size / num_rpn_pos
                rpn_loss2 = rpn_box_loss(rpn_box, rpn_box_targets, rpn_box_masks) * rpn_box.size / num_rpn_pos
                # losses of rcnn
                num_rcnn_pos = (cls_targets >= 0).sum()
                rcnn_loss1 = rcnn_cls_loss(cls_preds, cls_targets, cls_targets >= 0) * cls_targets.size / cls_targets.shape[0] / num_rcnn_pos
                rcnn_loss2 = rcnn_box_loss(box_preds, box_targets, box_masks) * box_preds.size / box_preds.shape[0] / num_rcnn_pos
                # some standard gluon training steps:
                autograd.backward([rpn_loss1, rpn_loss2, rcnn_loss1, rcnn_loss2])
            trainer.step(batch_size)
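To check that the losses actually decrease over time, I'm considering a printout right after trainer.step (a sketch reusing the loss variables above; the interval of 100 batches is arbitrary):

            # hypothetical progress printout, same indentation level as trainer.step(batch_size)
            if ib % 100 == 0:
                print('batch {}: rpn_cls={:.4f} rpn_box={:.4f} rcnn_cls={:.4f} rcnn_box={:.4f}'.format(
                    ib,
                    rpn_loss1.mean().asscalar(), rpn_loss2.mean().asscalar(),
                    rcnn_loss1.mean().asscalar(), rcnn_loss2.mean().asscalar()))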
I have doubts about the trainer. I found this in other examples, but I'm not sure whether it works in this context.
trainer = gluon.Trainer(net.collect_params(), 'sgd',{'learning_rate': 0.01, 'wd': 0.05, 'momentum': 0.9})
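For what it's worth, these hyperparameters look much more aggressive than what I've seen in the full training scripts; something closer to this might be intended (the exact values are my own assumption, not taken from the tutorial):

# assumed values, roughly matching GluonCV's reference Faster R-CNN settings
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': 0.001,  # vs. 0.01 above
                         'wd': 5e-4,              # vs. 0.05 above, which is a very strong weight decay
                         'momentum': 0.9})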
EDIT: Here's a copy of the .ipynb file I've been working on (Google Colab version): https://drive.google.com/file/d/1WevimDyTP1lvq_A0OBRMgC-PH8pK4iBv/view?usp=sharing