I have a script that grabs an application's screenshot and displays it. It works quite nicely on my machine, like a video at around 60 FPS. Now I want to run a YOLOv5 object detection model on these frames, using TorchHub, as advised here.
The following works:
import os
os.getcwd()
from PIL import ImageGrab
import numpy as np
import cv2
import pyautogui
import win32gui
import time
from mss import mss
from PIL import Image
import tempfile
os.system('calc')
sct = mss()
xx=1
tstart = time.time()
while xx < 10000:
    hwnd = win32gui.FindWindow(None, 'Calculator')
    left_x, top_y, right_x, bottom_y = win32gui.GetWindowRect(hwnd)
    #screen = np.array(ImageGrab.grab( bbox = (left_x, top_y, right_x, bottom_y ) ) )
    bbox = {'top': top_y, 'left': left_x, 'width': right_x - left_x, 'height': bottom_y - top_y}
    screen = sct.grab(bbox)
    scr = np.array(screen)
    cv2.imshow('window', scr)
    if cv2.waitKey(25) & 0xFF == ord('q'):
        cv2.destroyAllWindows()
        break
    xx += 1
cv2.destroyAllWindows()
tend = time.time()
print(xx/(tend-tstart))
print((tend-tstart))
os.system('taskkill /f /im calculator.exe')
Below I try to import torch and use my previously trained model:
screen = sct.grab(bbox)
scr = np.array(screen)
result = model(scr, size=400)
result.save("test.png") #this gives a TypeError: save() takes 1 positional argument but 2 were given
result.show() #this opens a new Paint instance for every frame instead of keeping the same window.
# The shown image is also in a wrong color channel
scr = cv2.imread("test.png")
# How can I use the `result` as argument to cv2.imshow(),
# without saving to disk if possible?
My questions:
1. result.show() shows an image with the wrong color channels compared to cv2.imshow(). How can I ensure that the image being fed to model has the correct channel order?
2. The performance of classification and detection drastically decreases compared to the training validation, perhaps because of 1?
3. Do you know how I can display the result image with bounding boxes in a single window, like cv2.imshow() does? (result.show() opens a new Paint process instance for each frame.) How can I save this result image to disk, and where can I find more documentation on how to interact with model objects in general?
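For reference, here is a minimal sketch of one possible way to approach questions 1 and 3. It rests on several assumptions: that np.array(sct.grab(bbox)) yields a BGRA array, that the hub model treats NumPy input as RGB, and that the object returned by model(...) has a render() method returning a list of annotated arrays in the same channel order as the input. The helper names bgra_to_rgb, rgb_to_bgr and detect_and_annotate are hypothetical and not part of any library.

```python
import numpy as np


def bgra_to_rgb(frame):
    """Drop the alpha channel of an H x W x 4 BGRA frame and reorder to RGB.

    Assumption: frames from np.array(sct.grab(bbox)) are BGRA.
    """
    # Indices 2, 1, 0 of the last axis are R, G, B; copy so the array is
    # contiguous (reversed views have negative strides).
    return np.ascontiguousarray(frame[..., 2::-1])


def rgb_to_bgr(image):
    """Reverse the channel order of an H x W x 3 image (RGB <-> BGR)."""
    return np.ascontiguousarray(image[..., ::-1])


def detect_and_annotate(model, bgra_frame, size=400):
    """Run the model on one captured frame and return a BGR image with boxes.

    Assumption: model(rgb, size=...) returns an object whose render()
    method gives a list with one annotated RGB array per input image.
    """
    rgb = bgra_to_rgb(bgra_frame)
    results = model(rgb, size=size)
    annotated_rgb = results.render()[0]
    # cv2.imshow and cv2.imwrite expect BGR, so convert back before display.
    return rgb_to_bgr(annotated_rgb)
```

Inside the capture loop, cv2.imshow('window', detect_and_annotate(model, scr)) would then reuse the same window for every frame, and cv2.imwrite("test.png", ...) on the same array would save it to disk without going through result.save(). If the model really was being fed BGR data as if it were RGB, that could also account for some of the accuracy drop in question 2, though this is only a guess.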