We are trying to get the detected object names using Python and YOLOv8 with the following code.

import cv2
from ultralytics import YOLO


def main():
    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

    model = YOLO("yolov8n.pt")

    while True:
        ret, frame = cap.read()
        result = model(frame, agnostic_nms=True)[0]

        print(result)

        if cv2.waitKey(30) == 27:
            break

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()

Two kinds of output appear in the log. The first is printed by YOLOv8 itself:

0: 384x640 1 person, 151.2ms
Speed: 0.6ms preprocess, 151.2ms inference, 1.8ms postprocess per image at shape (1, 3, 640, 640)

The second output, shown below, comes from our own print(result) call. How do we get the string `person` from it? Presumably we look up key 0 in names, but where does the 0 come from?

ultralytics.yolo.engine.results.Results object with attributes:

boxes: ultralytics.yolo.engine.results.Boxes object
keypoints: None
keys: ['boxes']
masks: None
names: {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
orig_img: array([[[51, 58, 64],
        [52, 59, 65],
        [54, 59, 65],
        ...,
        [64, 68, 74],
        [62, 67, 73],
        [62, 67, 73]],

       [[51, 58, 64],
        [53, 59, 65],
        [54, 59, 65],
        ...,
        [63, 68, 74],
        [62, 67, 73],
        [62, 67, 73]],

       [[53, 58, 64],
        [53, 58, 64],
        [53, 58, 64],
        ...,
        [61, 67, 73],
        [61, 67, 73],
        [61, 67, 73]],

       ...,

       [[43, 48, 58],
        [42, 47, 57],
        [41, 46, 56],
        ...,
        [24, 35, 49],
        [23, 34, 48],
        [23, 34, 48]],

       [[44, 48, 59],
        [43, 47, 57],
        [42, 46, 56],
        ...,
        [26, 35, 49],
        [26, 35, 49],
        [24, 33, 48]],

       [[45, 48, 59],
        [43, 45, 56],
        [40, 43, 54],
        ...,
        [26, 35, 49],
        [26, 35, 49],
        [25, 33, 48]]], dtype=uint8)
orig_shape: (720, 1280)
path: 'image0.jpg'
probs: None
speed: {'preprocess': 1.6682147979736328, 'inference': 79.47301864624023, 'postprocess': 1.0020732879638672}

We would prefer a solution along these lines, but if that is not possible, any other method is fine as long as it uses Python and YOLOv8. We plan to display bounding boxes and object names.

Additional Information

I changed the code as follows.

        ret, frame = cap.read()
        # result = model(frame, agnostic_nms=True)[0]
        result = model([frame])[0]

        boxes = result.boxes
        masks = result.masks
        probs = result.probs

        print("[boxes]==============================")
        print(boxes)
        print("[masks]==============================")
        print(masks)
        print("[probs]==============================")
        print(probs)

Even so, the output below does not contain the string `person`. How should we determine the class name?

[boxes]==============================
WARNING ⚠️ 'Boxes.boxes' is deprecated. Use 'Boxes.data' instead.
ultralytics.yolo.engine.results.Boxes object with attributes:

boxes: tensor([[4.7356e+01, 7.2858e+00, 1.1974e+03, 7.1092e+02, 8.6930e-01, 0.0000e+00]])
cls: tensor([0.])
conf: tensor([0.8693])
data: tensor([[4.7356e+01, 7.2858e+00, 1.1974e+03, 7.1092e+02, 8.6930e-01, 0.0000e+00]])
id: None
is_track: False
orig_shape: tensor([ 720, 1280])
shape: torch.Size([1, 6])
xywh: tensor([[ 622.4028,  359.1004, 1150.0942,  703.6293]])
xywhn: tensor([[0.4863, 0.4988, 0.8985, 0.9773]])
xyxy: tensor([[  47.3557,    7.2858, 1197.4500,  710.9150]])
xyxyn: tensor([[0.0370, 0.0101, 0.9355, 0.9874]])
[masks]==============================
None
[probs]==============================
None
Ganessa
  • Does this answer your question? [YOLOv8 get predicted class name](https://stackoverflow.com/questions/75277492/yolov8-get-predicted-class-name) – Mike B May 03 '23 at 11:28
  • I have not confirmed that yet. But I see it looks like the answer we were looking for. We didn't find it before the question. Thanks. – Ganessa May 04 '23 at 22:11
  • Please confirm whether that is the case so that we don't bloat Stack Overflow with duplicate questions – Mike B May 05 '23 at 07:24

3 Answers

There are probably better solutions to this, but I couldn't really find anything useful either, so I did this:

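while True:
    ret, frame = cap.read()
    # model(...) returns a list of Results objects; take the one for this frame
    results = model(frame, agnostic_nms=True)[0]

    # Skip frames in which nothing was detected
    if not results or len(results) == 0:
        continue

    for result in results:

        detection_count = result.boxes.shape[0]

        for i in range(detection_count):
            cls = int(result.boxes.cls[i].item())  # class index, e.g. 0
            name = result.names[cls]  # class name, e.g. 'person'
            confidence = float(result.boxes.conf[i].item())
            bounding_box = result.boxes.xyxy[i].cpu().numpy()  # x1, y1, x2, y2

            # Convert corner coordinates to top-left corner plus width/height
            x = int(bounding_box[0])
            y = int(bounding_box[1])
            width = int(bounding_box[2] - x)
            height = int(bounding_box[3] - y)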
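
Since the question mentions displaying bounding boxes and object names, here is a small sketch of the two pieces of arithmetic involved: turning an xyxy box into x, y, width, height, and building the label text. `xyxy_to_xywh` and `make_label` are hypothetical helper names, not part of ultralytics, and the numbers are copied from the `xyxy` and `conf` values printed in the question so that it runs without a camera or model.

```python
def xyxy_to_xywh(box):
    """Convert [x1, y1, x2, y2] to integer (x, y, width, height)."""
    x1, y1, x2, y2 = box
    x = int(x1)
    y = int(y1)
    return x, y, int(x2 - x), int(y2 - y)


def make_label(name, confidence):
    """Build the text to draw next to a box, e.g. 'person 0.87'."""
    return f"{name} {confidence:.2f}"


# Values taken from the Boxes printout in the question.
box = [47.3557, 7.2858, 1197.4500, 710.9150]
x, y, w, h = xyxy_to_xywh(box)
label = make_label("person", 0.8693)
print((x, y, w, h), label)  # (47, 7, 1150, 703) person 0.87
```

Inside the loop, the box and text could then be drawn with `cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)` and `cv2.putText(frame, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)`, followed by `cv2.imshow`. The colour, font, and offset are arbitrary choices.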
adsdf

I tried to get it as follows. I have not yet confirmed whether this is correct.

        print(result.names[int(result.boxes.cls[0])])
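
The line above looks up only the first detection, and indexing `cls[0]` would presumably fail on a frame with no detections. A minimal sketch of the same lookup applied to every detection follows. `class_names` is a hypothetical helper, not an ultralytics function, and it takes plain Python values so it can be run without a model; the sample input mirrors `cls: tensor([0.])` from the question.

```python
def class_names(cls_ids, names):
    """Map class indices (possibly floats such as 0.0) to class-name strings."""
    return [names[int(c)] for c in cls_ids]


# Truncated copy of the names dict printed in the question.
names = {0: "person", 1: "bicycle", 2: "car"}

# cls printed as tensor([0.]) in the question, written here as a plain list.
detected = class_names([0.0], names)
print(detected)  # ['person']

# With a real Results object the call would look like this (not executed here):
#   class_names(result.boxes.cls.tolist(), result.names)
```

An empty list of indices simply returns an empty list, so no special case is needed for frames without detections.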
Ganessa

inputs = [img, img]  # list of numpy arrays
results = model(inputs)  # list of Results objects

for result in results:
    boxes = result.boxes  # Boxes object for bbox outputs
    masks = result.masks  # Masks object for segmentation masks outputs
    probs = result.probs  # Class probabilities for classification outputs

Refer to the YOLOv8 docs.
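
The snippet above stops at `boxes`, but the class indices live in `boxes.cls` and the names in `result.names`, as the printout in the question shows. As a rough sketch of how the per-image names could be recovered for a batch, the following counts detections by name. `count_names` is a hypothetical helper and the class indices for the two images are made up for illustration.

```python
from collections import Counter


def count_names(cls_ids, names):
    """Count detections per class name for one image."""
    return dict(Counter(names[int(c)] for c in cls_ids))


# Truncated copy of the names dict printed in the question.
names = {0: "person", 1: "bicycle", 2: "car"}

# Made-up class indices for a batch of two images (assumption, not real output).
batch_cls = [[0.0], [0.0, 2.0, 0.0]]

for image_index, cls_ids in enumerate(batch_cls):
    print(image_index, count_names(cls_ids, names))

# With real results the loop would look like this (not executed here):
#   for result in results:
#       count_names(result.boxes.cls.tolist(), result.names)
```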

Nuhman Pk
  • Thanks for your answer. We had already tried it, but still it does not contain anything indicating `person`. – Ganessa Apr 24 '23 at 04:19