Training custom model

Question

I am trying to train YOLOv5 on my own dataset. I normalized the data as described in the docs on GitHub, but I always end up with this error:

              from  n    params  module                                  arguments                     
  0             -1  1      8800  models.common.Focus                     [3, 80, 3]                    
  1             -1  1    115520  models.common.Conv                      [80, 160, 3, 2]               
  2             -1  1    315680  models.common.BottleneckCSP             [160, 160, 4]                 
  3             -1  1    461440  models.common.Conv                      [160, 320, 3, 2]              
  4             -1  1   3311680  models.common.BottleneckCSP             [320, 320, 12]                
  5             -1  1   1844480  models.common.Conv                      [320, 640, 3, 2]              
  6             -1  1  13228160  models.common.BottleneckCSP             [640, 640, 12]                
  7             -1  1   7375360  models.common.Conv                      [640, 1280, 3, 2]             
  8             -1  1   4099840  models.common.SPP                       [1280, 1280, [5, 9, 13]]      
  9             -1  1  20087040  models.common.BottleneckCSP             [1280, 1280, 4, False]        
 10             -1  1    820480  models.common.Conv                      [1280, 640, 1, 1]             
 11             -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 12        [-1, 6]  1         0  models.common.Concat                    [1]                           
 13             -1  1   5435520  models.common.BottleneckCSP             [1280, 640, 4, False]         
 14             -1  1    205440  models.common.Conv                      [640, 320, 1, 1]              
 15             -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 16        [-1, 4]  1         0  models.common.Concat                    [1]                           
 17             -1  1   1360960  models.common.BottleneckCSP             [640, 320, 4, False]          
 18             -1  1    922240  models.common.Conv                      [320, 320, 3, 2]              
 19       [-1, 14]  1         0  models.common.Concat                    [1]                           
 20             -1  1   5025920  models.common.BottleneckCSP             [640, 640, 4, False]          
 21             -1  1   3687680  models.common.Conv                      [640, 640, 3, 2]              
 22       [-1, 10]  1         0  models.common.Concat                    [1]                           
 23             -1  1  20087040  models.common.BottleneckCSP             [1280, 1280, 4, False]        
 24   [17, 20, 23]  1         0  models.yolo.Detect                      [3, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]]]
Traceback (most recent call last):
  File "train.py", line 404, in <module>
    train(hyp)
  File "train.py", line 80, in train
    model = Model(opt.cfg).to(device)
  File "/content/yolov5/models/yolo.py", line 62, in __init__
    m.stride = torch.tensor([128 / x.shape[-2] for x in self.forward(torch.zeros(1, ch, 128, 128))])  # forward
  File "/content/yolov5/models/yolo.py", line 90, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/content/yolov5/models/yolo.py", line 107, in forward_once
    x = m(x)  # run
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/yolov5/models/yolo.py", line 26, in forward
    x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
RuntimeError: shape '[1, 3, 8, 16, 16]' is invalid for input of size 81920

These are the flags used:

!python train.py --img 1024 --batch 4 --epochs 30 \
  --data ./data/mask.yaml --cfg ./models/yolov5x.yaml --weights yolov5x.pt \
   --cache --name maskmodel

This is the file structure: [image not included]

Hello, can you post a minimal reproducible sample of code so we can diagnose the problem (as code, not as an image, please)? If I can try a blind guess: the `view` operation mentioned in the last line of the error is trying to reshape a tensor that has 81920 elements, while the requested shape has `1*3*8*16*16 = 6144` elements (maybe `bs=1` is a mistake?). You should check the size of your tensor right before this operation. – trialNerror, Sep 26 '20 at 15:51
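
The element-count mismatch described in the comment can be checked without PyTorch. This is a minimal sketch that uses only the two numbers from the error message. The axis names (`bs`, `na`, `no`, `ny`, `nx`) come from the `view` call in the traceback. The channel estimate assumes the incoming tensor has batch size 1 on a 16x16 grid, which the error message itself does not confirm.

```python
from math import prod

# Target shape from the error message, in the order used by the view call:
# (bs, na, no, ny, nx)
target_shape = (1, 3, 8, 16, 16)
input_numel = 81920  # "input of size 81920" from the error message

target_numel = prod(target_shape)
print(target_numel)                 # 6144
print(input_numel == target_numel)  # False, so view() cannot succeed

# Assumption: the input really has bs=1 and a 16x16 grid.
bs, na, no, ny, nx = target_shape
channels_expected = na * no                      # channels view() needs
channels_actual = input_numel // (bs * ny * nx)  # channels actually present
print(channels_expected, channels_actual)        # 24 320
```

Under that assumption, the tensor reaching the detection layer carries 320 channels where 24 (`na * no`) are needed. That would be consistent with layer 17 in the printed model, which outputs 320 channels.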

score 1 · Accepted Answer · answered Sep 27 '20 at 07:23

For anyone facing the same problem, I found my issue. When splitting the data into training and validation sets, make sure you pick a seed. Furthermore, when normalizing the files for YOLOv5 input, make sure the IDs in them are numbers, without any text. Thank you.
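
The two points in this answer could be sketched as follows. This is an illustration only, not the code the answer's author used. The helper names `split_train_val` and `is_valid_label_line` are hypothetical, as are the 20% validation fraction and the seed value. The check assumes the usual YOLOv5 label layout of one object per line: `class x_center y_center width height`, with an integer class ID and coordinates normalized to the range 0 to 1.

```python
import random


def split_train_val(items, val_fraction=0.2, seed=42):
    """Split items reproducibly; the same seed always gives the same split."""
    items = sorted(items)      # fixed starting order, independent of the OS
    rng = random.Random(seed)  # local generator, leaves global state alone
    rng.shuffle(items)
    n_val = int(len(items) * val_fraction)
    return items[n_val:], items[:n_val]


def is_valid_label_line(line):
    """Check one label line: numeric class ID plus four normalized floats."""
    parts = line.split()
    if len(parts) != 5:
        return False
    class_id, *coords = parts
    if not class_id.isdigit():  # rejects text IDs such as "mask"
        return False
    try:
        values = [float(c) for c in coords]
    except ValueError:
        return False
    return all(0.0 <= v <= 1.0 for v in values)


images = [f"img{i}.jpg" for i in range(10)]  # hypothetical file names
train, val = split_train_val(images)
print(len(train), len(val))                          # 8 2
print(is_valid_label_line("0 0.5 0.5 0.2 0.3"))      # True
print(is_valid_label_line("mask 0.5 0.5 0.2 0.3"))   # False
```

Running the validity check over every label file before training surfaces text class IDs early, rather than as a failure somewhere inside the training script.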
