Obtaining detected object names using YOLOv8

Question

We are trying to get the detected object names using Python and YOLOv8 with the following code.

import cv2
from ultralytics import YOLO


def main():
    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

    model = YOLO("yolov8n.pt")

    while True:
        ret, frame = cap.read()
        if not ret:  # stop if the camera returned no frame
            break
        result = model(frame, agnostic_nms=True)[0]

        print(result)

        if cv2.waitKey(30) == 27:
            break

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()

The following two kinds of output appear in the log.

0: 384x640 1 person, 151.2ms
Speed: 0.6ms preprocess, 151.2ms inference, 1.8ms postprocess per image at shape (1, 3, 640, 640)

The second output is the one we displayed using print. How do we get `person` out of it? Presumably we get it by passing 0 to names, but where does the 0 come from?

ultralytics.yolo.engine.results.Results object with attributes:

boxes: ultralytics.yolo.engine.results.Boxes object
keypoints: None
keys: ['boxes']
masks: None
names: {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
orig_img: array([[[51, 58, 64],
        [52, 59, 65],
        [54, 59, 65],
        ...,
        [64, 68, 74],
        [62, 67, 73],
        [62, 67, 73]],

       [[51, 58, 64],
        [53, 59, 65],
        [54, 59, 65],
        ...,
        [63, 68, 74],
        [62, 67, 73],
        [62, 67, 73]],

       [[53, 58, 64],
        [53, 58, 64],
        [53, 58, 64],
        ...,
        [61, 67, 73],
        [61, 67, 73],
        [61, 67, 73]],

       ...,

       [[43, 48, 58],
        [42, 47, 57],
        [41, 46, 56],
        ...,
        [24, 35, 49],
        [23, 34, 48],
        [23, 34, 48]],

       [[44, 48, 59],
        [43, 47, 57],
        [42, 46, 56],
        ...,
        [26, 35, 49],
        [26, 35, 49],
        [24, 33, 48]],

       [[45, 48, 59],
        [43, 45, 56],
        [40, 43, 54],
        ...,
        [26, 35, 49],
        [26, 35, 49],
        [25, 33, 48]]], dtype=uint8)
orig_shape: (720, 1280)
path: 'image0.jpg'
probs: None
speed: {'preprocess': 1.6682147979736328, 'inference': 79.47301864624023, 'postprocess': 1.0020732879638672}

We would like to solve it this way if possible. If that is not possible, another method is fine as long as it uses Python and YOLOv8. We plan to display bounding boxes and object names.

Additional Information

I changed the code as follows.

        ret, frame = cap.read()
        # result = model(frame, agnostic_nms=True)[0]
        result = model([frame])[0]

        boxes = result.boxes
        masks = result.masks
        probs = result.probs

        print("[boxes]==============================")
        print(boxes)
        print("[masks]==============================")
        print(masks)
        print("[probs]==============================")
        print(probs)

The output below still does not include `person`. How should we determine it?

[boxes]==============================
WARNING ⚠️ 'Boxes.boxes' is deprecated. Use 'Boxes.data' instead.
ultralytics.yolo.engine.results.Boxes object with attributes:

boxes: tensor([[4.7356e+01, 7.2858e+00, 1.1974e+03, 7.1092e+02, 8.6930e-01, 0.0000e+00]])
cls: tensor([0.])
conf: tensor([0.8693])
data: tensor([[4.7356e+01, 7.2858e+00, 1.1974e+03, 7.1092e+02, 8.6930e-01, 0.0000e+00]])
id: None
is_track: False
orig_shape: tensor([ 720, 1280])
shape: torch.Size([1, 6])
xywh: tensor([[ 622.4028,  359.1004, 1150.0942,  703.6293]])
xywhn: tensor([[0.4863, 0.4988, 0.8985, 0.9773]])
xyxy: tensor([[  47.3557,    7.2858, 1197.4500,  710.9150]])
xyxyn: tensor([[0.0370, 0.0101, 0.9355, 0.9874]])
[masks]==============================
None
[probs]==============================
None

Does this answer your question? [YOLOv8 get predicted class name](https://stackoverflow.com/questions/75277492/yolov8-get-predicted-class-name) – Mike B May 03 '23 at 11:28
I have not confirmed it yet, but it looks like the answer we were looking for. We did not find it before asking. Thanks. – Ganessa May 04 '23 at 22:11
Please confirm whether that is the case, so that we don't bloat Stack Overflow with duplicate questions. – Mike B May 05 '23 at 07:24

score 1 · Accepted Answer · answered Apr 24 '23 at 05:22

There are probably better solutions to this, but I couldn't really find anything useful either, so I did this:

while True:
    ret, frame = cap.read()
    results = model(frame, agnostic_nms=True)[0]

    if not results or len(results) == 0:
        continue

    for result in results:

        detection_count = result.boxes.shape[0]

        for i in range(detection_count):
            cls = int(result.boxes.cls[i].item())
            name = result.names[cls]
            confidence = float(result.boxes.conf[i].item())
            bounding_box = result.boxes.xyxy[i].cpu().numpy()

            x = int(bounding_box[0])
            y = int(bounding_box[1])
            width = int(bounding_box[2] - x)
            height = int(bounding_box[3] - y)
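
Since the goal is to display bounding boxes and object names, the values collected in the loop above can be turned into a label and a rectangle. The following is only a minimal sketch and not part of the original answer: `describe_detection` is a hypothetical helper, and it works on plain Python values so it can be tried without a camera or a model.

```python
def describe_detection(names, cls_id, confidence, xyxy):
    """Return a display label and an (x, y, width, height) rectangle.

    names is the id -> name mapping (result.names in the loop above),
    xyxy is the box as (x1, y1, x2, y2) in pixels.
    """
    x1, y1, x2, y2 = (int(v) for v in xyxy)
    label = f"{names[cls_id]} {confidence:.2f}"
    return label, (x1, y1, x2 - x1, y2 - y1)


# Values copied from the Boxes output printed in the question.
names = {0: "person", 1: "bicycle"}
label, rect = describe_detection(
    names, 0, 0.8693, (47.3557, 7.2858, 1197.45, 710.915)
)
print(label, rect)
```

With OpenCV, the label and rectangle could then be drawn on the frame with `cv2.rectangle` and `cv2.putText` before showing it with `cv2.imshow`.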

This is the answer I wanted. Thank you. – Ganessa Apr 24 '23 at 05:55

score 0 · Answer 2 · answered Apr 24 '23 at 04:42

I tried to get it as follows, though I have not yet confirmed that this is correct.

        print(result.names[int(result.boxes.cls[0])])
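
One caveat with this one-liner: indexing `cls[0]` raises an IndexError on frames where nothing is detected, because the tensor is then empty. Below is a minimal sketch of a guarded lookup. It uses plain lists in place of `result.boxes.cls`, and `first_class_name` is a hypothetical helper name, not something provided by ultralytics.

```python
def first_class_name(names, cls_ids):
    """Return the name of the first detected class, or None if nothing was detected."""
    if len(cls_ids) == 0:
        return None
    return names[int(cls_ids[0])]


names = {0: "person", 1: "bicycle"}
print(first_class_name(names, [0.0]))  # one detection with class id 0
print(first_class_name(names, []))     # no detections in this frame
```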

answered by Ganessa

score -1 · Answer 3 · answered Apr 22 '23 at 15:05

inputs = [img, img]  # list of numpy arrays
results = model(inputs)  # list of Results objects

for result in results:
    boxes = result.boxes  # Boxes object for bbox outputs
    masks = result.masks  # Masks object for segmentation masks outputs
    probs = result.probs  # Class probabilities for classification outputs

Refer to the YOLOv8 docs.
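
The class id needed for the name lookup is inside `boxes`. Judging from the output printed in the question, each row of `boxes.data` appears to be laid out as x1, y1, x2, y2, confidence, class id: the first four values match `xyxy`, the fifth matches `conf`, and the last matches `cls`. Here is a small sketch under that assumption, using a plain list instead of a tensor; `decode_row` is a hypothetical name.

```python
def decode_row(row, names):
    """Split one detection row into box, confidence, and class name.

    Assumes the layout [x1, y1, x2, y2, confidence, class_id], which is
    inferred from the printed Boxes output rather than taken from the docs.
    """
    x1, y1, x2, y2, confidence, class_id = row
    return {
        "box": (x1, y1, x2, y2),
        "confidence": confidence,
        "name": names[int(class_id)],
    }


# Row copied from the printed `data` tensor in the question.
row = [4.7356e+01, 7.2858e+00, 1.1974e+03, 7.1092e+02, 8.6930e-01, 0.0000e+00]
names = {0: "person", 1: "bicycle"}
detection = decode_row(row, names)
print(detection["name"], detection["confidence"])
```

With a real result, something like `result.boxes.data.tolist()` should yield rows in this form, and `result.names` would serve as the mapping.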

answered by Nuhman Pk
Thanks for your answer. We had already tried it, but still it does not contain anything indicating `person`. – Ganessa Apr 24 '23 at 04:19
