I have ~50000 images and annotation files for training a YOLOv5 object detection model. I've trained a model no problem using just CPU on another computer, but it takes too long, so I need GPU training. My problem is, when I try to train with a GPU I keep getting this error:
OSError: [WinError 1455] The paging file is too small for this operation to complete
This is the command I'm executing:
train.py --img 640 --batch 4 --epochs 100 --data myyaml.yaml --weights yolov5l.pt
CUDA and PyTorch have successfully been installed and are available. The following command installed with no errors:
pip3 install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio===0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
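For reference, "available" here means PyTorch can see the GPU. A minimal sketch of that kind of check follows; the helper name cuda_report is only illustrative, and it assumes nothing beyond the standard torch.cuda calls:

```python
def cuda_report():
    """Summarise whether PyTorch is importable and whether it can see a CUDA device."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        # Name of the first visible GPU (device index 0).
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


print(cuda_report())
```

If cuda_available comes back False, training would silently fall back to the CPU, so it is worth confirming before a long run.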
I've found other people online with similar issues who fixed it by changing num_workers = 8 to num_workers = 1. When I tried this, training started and seemed to get past the point where the "paging file is too small" error appears, but then it crashed a couple of hours later. I've also increased the virtual memory (the Windows paging file size) as per this video (https://www.youtube.com/watch?v=Oh6dga-Oy10), but that didn't work either. I think it's a memory issue because some of the times it crashes I get a low memory warning from my computer.
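The worker count might also be set from the command line instead of by editing the source. The sketch below only builds the argument list for the command shown earlier and adds a --workers option. That flag is an assumption about the YOLOv5 version in use (it should be confirmed with python train.py --help), and build_train_command is a hypothetical helper, not part of YOLOv5:

```python
import sys


def build_train_command(workers=1, batch=4, epochs=100, img=640,
                        data="myyaml.yaml", weights="yolov5l.pt"):
    """Return the training command as an argument list (nothing is executed)."""
    return [
        sys.executable, "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data,
        "--weights", weights,
        # Assumed flag: number of dataloader worker processes.
        "--workers", str(workers),
    ]


cmd = build_train_command(workers=1)
print(" ".join(cmd))
# To launch it from the YOLOv5 repository root:
#   import subprocess; subprocess.run(cmd, check=True)
```

Keeping the command as a list makes it easy to try different worker counts or batch sizes while watching memory use.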
Any help would be much appreciated.