I am trying out detectron2 and want to train the sample model.
When running the following code I get (<class 'RuntimeError'>, RuntimeError('No CUDA GPUs are available'), <traceback object at 0x7f42b094ebc0>)
. Find below the code:
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()
# import some common libraries
import matplotlib.pyplot as plt
import numpy as np
import cv2
# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2.data.datasets import register_coco_instances
import random
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
import os
# To verify the data loading is correct, let's visualize the annotations of randomly selected samples in the training set:
register_coco_instances("fruits_nuts", {}, "../data/trainval.json", "../data/images")
fruits_nuts_metadata = MetadataCatalog.get("fruits_nuts")
dataset_dicts = DatasetCatalog.get("fruits_nuts")
'''
for d in random.sample(dataset_dicts, 3):
img = cv2.imread(d["file_name"])
visualizer = Visualizer(img[:, :, ::-1], metadata=fruits_nuts_metadata, scale=0.5)
vis = visualizer.draw_dataset_dict(d)
cv2.imshow('new', vis.get_image()[:, :, ::-1])
cv2.waitKey(0)
'''
# train model
cfg = get_cfg()
cfg.merge_from_file("../detectron2_repo/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("fruits_nuts",)
cfg.DATASETS.TEST = () # no metrics implemented for this dataset
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl" # initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.02
cfg.SOLVER.MAX_ITER = 300 # 300 iterations seems good enough, but you can certainly train longer
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128 # faster, and good enough for this toy dataset
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3 # 3 classes (data, fig, hazelnut)
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
I ran the script collect_env.py
from torch:
/home/project/.venv/bin/python /home/project/src/collect_env.py
Collecting environment information...
PyTorch version: 1.10.2+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31
Python version: 3.8.10 (default, Nov 26 2021, 20:14:08) [GCC 9.3.0] (64-bit runtime)
Python platform: Linux-5.13.0-27-generic-x86_64-with-glibc2.29
Is CUDA available: False
CUDA runtime version: 10.1.243
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.22.1
[pip3] torch==1.10.2
[pip3] torchvision==0.11.3
[conda] Could not collect
Process finished with exit code 0
I am having on the system a RTX3080 graphic card. However, it seems to me that its not found.
Any suggestions why?
Is there a way to run the training without CUDA?
I appreciate your replies!