The main problem is with the ROI selection event and how it is being called currently. The current implementation is not dynamic which means we are unable to visualize what ROI we are trying to select. Also, we have started processing before even selecting the ROI.
The proper way to select the ROI is that once we have captured the first frame, register the mouse click event and visualize the frame indefinitely with imshow
and waitKey(n)
until some specific key is pressed. Alternatively, we may be able to achieve the same effect without the infinite loop by using waitKey(0)
(Not tested).
At this stage, we should be able to draw the desired ROI rectangle. The key factor here is the execution must be halted either by using an infinite loop or waitKey(0)
. Just registering the event is not enough. After ROI selection is done, then proceed with the rest of the code.
Some recommendations are as follows:
- Avoid the usage of global variables where possible
- Create separate window for ROI selection and discard it afterwards
- Create separate functions for each individual task
Following is the complete code demonstrating correct usage of mouse click event to select the ROI for video processing:
import numpy as np
import cv2
import matplotlib.pyplot as plt
ORIGINAL_WINDOW_TITLE = 'Original'
FIRST_FRAME_WINDOW_TITLE = 'First Frame'
DIFFERENCE_WINDOW_TITLE = 'Difference'
canvas = None
drawing = False # true if mouse is pressed
#Retrieve first frame
def initialize_camera(cap):
_, frame = cap.read()
return frame
# mouse callback function
def mouse_draw_rect(event,x,y,flags, params):
global drawing, canvas
if drawing:
canvas = params[0].copy()
if event == cv2.EVENT_LBUTTONDOWN:
drawing = True
params.append((x,y)) #Save first point
elif event == cv2.EVENT_MOUSEMOVE:
if drawing:
cv2.rectangle(canvas, params[1],(x,y),(0,255,0),2)
elif event == cv2.EVENT_LBUTTONUP:
drawing = False
params.append((x,y)) #Save second point
cv2.rectangle(canvas,params[1],params[2],(0,255,0),2)
def select_roi(frame):
global canvas
canvas = frame.copy()
params = [frame]
ROI_SELECTION_WINDOW = 'Select ROI'
cv2.namedWindow(ROI_SELECTION_WINDOW)
cv2.setMouseCallback(ROI_SELECTION_WINDOW, mouse_draw_rect, params)
roi_selected = False
while True:
cv2.imshow(ROI_SELECTION_WINDOW, canvas)
key = cv2.waitKey(10)
#Press Enter to break the loop
if key == 13:
break;
cv2.destroyWindow(ROI_SELECTION_WINDOW)
roi_selected = (3 == len(params))
if roi_selected:
p1 = params[1]
p2 = params[2]
if (p1[0] == p2[0]) and (p1[1] == p2[1]):
roi_selected = False
#Use whole frame if ROI has not been selected
if not roi_selected:
print('ROI Not Selected. Using Full Frame')
p1 = (0,0)
p2 = (frame.shape[1] - 1, frame.shape[0] -1)
return roi_selected, p1, p2
if __name__ == '__main__':
cap = cv2.VideoCapture(0)
#Grab first frame
first_frame = initialize_camera(cap)
#Select ROI for processing. Hit Enter after drawing the rectangle to finalize selection
roi_selected, point1, point2 = select_roi(first_frame)
#Grab ROI of first frame
first_frame_roi = first_frame[point1[1]:point2[1], point1[0]:point2[0], :]
#An empty image of full size just for visualization of difference
difference_image_canvas = np.zeros_like(first_frame)
while cap.isOpened():
ret, frame = cap.read()
if ret:
#ROI of current frame
roi = frame[point1[1]:point2[1], point1[0]:point2[0], :]
difference = cv2.absdiff(first_frame_roi, roi)
difference = cv2.GaussianBlur(difference, (3, 3), 0)
_, difference = cv2.threshold(difference, 18, 255, cv2.THRESH_BINARY)
#Overlay computed difference image onto the whole image for visualization
difference_image_canvas[point1[1]:point2[1], point1[0]:point2[0], :] = difference.copy()
cv2.imshow(FIRST_FRAME_WINDOW_TITLE, first_frame)
cv2.imshow(ORIGINAL_WINDOW_TITLE, frame)
cv2.imshow(DIFFERENCE_WINDOW_TITLE, difference_image_canvas)
key = cv2.waitKey(30) & 0xff
if key == 27:
break
else:
break
cap.release()
cv2.destroyAllWindows()
Pro Tip:
Sometimes, when a camera is initialized, it takes some time to warm-up depending upon the ambient light present in the room. You may consider skipping a few initial frames to let the camera settle from the initialization phase. It can be done by defining the initialize_camera
function in the above code as follows:
def initialize_camera(cap):
for i in range(0,60): #Skip first 60 frames
_, frame = cap.read()
return frame