I am using this implementation of FRCNN for training on my dataset:

https://github.com/kbardool/keras-frcnn

During training I get random exceptions with no stack trace:

708/1000 [====================>.........] - ETA: 289s - rpn_cls: 0.1376 - rpn_regr: 0.3020 - detector_cls: 
709/1000 [====================>.........] - ETA: 288s - rpn_cls: 0.1376 - rpn_regr: 0.3020 - detector_cls: 
710/1000 [====================>.........] - ETA: 287s - rpn_cls: 0.1374 - rpn_regr: 0.3021 - detector_cls: 
711/1000 [====================>.........] - ETA: 286s - rpn_cls: 0.1373 - rpn_regr: 0.3018 - detector_cls: 
712/1000 [====================>.........] - ETA: 284s - rpn_cls: 0.1371 - rpn_regr: 0.3017 - detector_cls: 
713/1000 [====================>.........] - ETA: 283s - rpn_cls: 0.1370 - rpn_regr: 0.3019 - detector_cls: 
714/1000 [====================>.........] - ETA: 282s - rpn_cls: 0.1370 - rpn_regr: 0.3017 - detector_cls: 0.0783 - detector_regr: 0.0686
Exception: 'a' cannot be empty unless no samples are taken

715/1000 [====================>.........] - ETA: 281s - rpn_cls: 0.1369 - rpn_regr: 0.3015 - detector_cls: 
716/1000 [====================>.........] - ETA: 280s - rpn_cls: 0.1367 - rpn_regr: 0.3013 - detector_cls: 
717/1000 [====================>.........] - ETA: 279s - rpn_cls: 0.1365 - rpn_regr: 0.3009 - detector_cls: 
718/1000 [====================>.........] - ETA: 278s - rpn_cls: 0.1363 - rpn_regr: 0.3011 - detector_cls:

Even though I get the error message, the loss still goes down. What could be the reason, and how can I fix it?

Stepan Yakovenko

2 Answers

I have been seeing the same error message, and it doesn't seem to affect the outcome of the training session. I have noticed that the error appears when the bounding boxes in my training data are smaller than 20 px. Let me know if you determine what is actually causing the issue!

Charlie E

I've found the root cause and have a superficial fix, but I don't fully understand what is going on. The exception is raised by the second call to np.random.choice. If the first call (replace=False) fails, the author catches the exception and retries with replace=True, so duplicates are allowed. However, if neg_samples is empty, the second call raises an exception as well.

try:
    selected_neg_samples = np.random.choice(neg_samples, C.num_rois - len(selected_pos_samples), replace=False).tolist()
except:
    selected_neg_samples = np.random.choice(neg_samples, C.num_rois - len(selected_pos_samples), replace=True).tolist()
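The failure can be reproduced outside the training loop. The following is a minimal sketch, not code from the repository: `sample_like_original` and `reproduce` are hypothetical names, and the list `[]` stands in for an empty neg_samples.

```python
import numpy as np


def sample_like_original(neg_samples, count):
    """Mirror the try/except pattern above (illustrative helper, not repo code)."""
    try:
        # First attempt: sample without duplicates.
        return np.random.choice(neg_samples, count, replace=False).tolist()
    except ValueError:
        # Fallback: allow duplicates. This still raises if neg_samples is empty.
        return np.random.choice(neg_samples, count, replace=True).tolist()


def reproduce():
    """Return the error message raised for an empty pool, or None if none is raised."""
    try:
        sample_like_original([], 4)
    except ValueError as exc:
        return str(exc)
    return None


print(reproduce())
```

With a non-empty but too-small pool, for example `sample_like_original([1, 2], 4)`, the fallback succeeds and returns four values with repeats, so only the empty case escapes the handler.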

I have "fixed" it like this:

try:
    selected_neg_samples = np.random.choice(neg_samples, C.num_rois - len(selected_pos_samples), replace=False).tolist()
except ValueError:
    # Request zero samples when neg_samples is empty so the retry cannot raise.
    num_neg = (C.num_rois - len(selected_pos_samples)) if len(neg_samples) > 0 else 0
    selected_neg_samples = np.random.choice(neg_samples, num_neg, replace=True).tolist()

Again, I am not sure whether it is OK to take no negative samples when neg_samples is empty. Maybe someone with a better understanding of the algorithm can comment on this.
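The same idea can be written without relying on exceptions for control flow. This is only a sketch under assumptions: `select_negatives` is a hypothetical helper, not part of keras-frcnn, and it assumes the rest of the training step can cope with receiving fewer negatives than requested, which I have not verified.

```python
import numpy as np


def select_negatives(neg_samples, num_needed):
    """Pick num_needed negative indices, or none if the pool is empty.

    Hypothetical helper: names and behavior are assumptions for illustration.
    """
    # Nothing to sample from, or nothing requested: return an empty selection.
    if num_needed <= 0 or len(neg_samples) == 0:
        return []
    # Allow duplicates only when the pool is smaller than the request.
    replace = len(neg_samples) < num_needed
    return np.random.choice(neg_samples, num_needed, replace=replace).tolist()


# Usage with made-up index lists:
print(select_negatives([], 4))         # empty pool, no exception
print(select_negatives([3, 7, 9], 2))  # two distinct indices
print(select_negatives([5], 3))        # pool too small, duplicates allowed
```

In the training script this would be called as `select_negatives(neg_samples, C.num_rois - len(selected_pos_samples))`. Using `len(neg_samples) == 0` rather than a truthiness check keeps the guard valid for both lists and NumPy arrays.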

Stepan Yakovenko