Use SSIM to calculate diff of two video frames

Question

I'm trying to use SSIM to find differences between two images (from a camera, so the view is fixed). Although the two images are almost identical except for some small lighting changes, it reports tons of differences. My impression was that SSIM would be more robust to brightness and lighting changes.

Is there anything in the code (threshold tuning, a different algorithm, ...) that I could change so it performs better? Alternatively, what would be better approaches to try here?

And the diff mask:

Here is my code:

from skimage.metrics import structural_similarity
import cv2
import numpy as np

f1='g0.png'
f2='g1.png'
before = cv2.imread(f1)
after = cv2.imread(f2)

# Convert images to grayscale
before_gray = cv2.cvtColor(before, cv2.COLOR_BGR2GRAY)
after_gray = cv2.cvtColor(after, cv2.COLOR_BGR2GRAY)

# Compute SSIM between two images
(score, diff) = structural_similarity(before_gray, after_gray, full=True)
print("Image similarity", score)

# The diff image contains the actual image differences between the two images
# and is represented as a floating point data type in the range [0,1] 
# so we must convert the array to 8-bit unsigned integers in the range
# [0,255] before we can use it with OpenCV
diff = (diff * 255).astype("uint8")

# Threshold the difference image, followed by finding contours to
# obtain the regions of the two input images that differ
thresh = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
contours = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]

mask = np.zeros(before.shape, dtype='uint8')
filled_after = after.copy()

for c in contours:
    area = cv2.contourArea(c)
    if area > 40:
        x,y,w,h = cv2.boundingRect(c)
        cv2.rectangle(before, (x, y), (x + w, y + h), (36,255,12), 2)
        cv2.rectangle(after, (x, y), (x + w, y + h), (36,255,12), 2)
        cv2.drawContours(mask, [c], 0, (0,255,0), -1)
        cv2.drawContours(filled_after, [c], 0, (0,255,0), -1)

cv2.imshow('mask',mask)

cv2.waitKey(0)

What range of values does diff have before your uint8 conversion? If you don't know that there are differences, Otsu thresholding is not the right choice, because it will always choose some threshold that gives you some foreground. — Micka, Aug 27 '23 at 06:35
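One way to answer the question about the values before the uint8 conversion is to summarize the float SSIM map directly. The sketch below is illustrative only: `summarize_ssim_map` is a hypothetical helper, and the small array stands in for the `diff` returned by `structural_similarity(..., full=True)`. Local SSIM values can in principle be negative, and a plain cast to uint8 does not clamp them to 0, so the negative fraction is worth checking.

```python
import numpy as np

def summarize_ssim_map(ssim_map):
    """Return basic statistics of a float SSIM map (hypothetical helper)."""
    ssim_map = np.asarray(ssim_map, dtype=np.float64)
    return {
        "min": float(ssim_map.min()),
        "max": float(ssim_map.max()),
        "mean": float(ssim_map.mean()),
        # Share of pixels below zero; these do not survive a plain uint8 cast
        "negative_fraction": float((ssim_map < 0).mean()),
    }

# Synthetic stand-in for the `diff` array from the question (assumption)
demo_map = np.array([[1.0, 0.9], [0.2, -0.1]])
stats = summarize_ssim_map(demo_map)
print(stats)
```

If the negative fraction is not zero, clipping with `np.clip(diff, 0, 1)` before multiplying by 255 is one option to consider.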
What you could do is record a longer set of "no differences, only lighting changes" sequences and from those choose a threshold that gives no differences. Then test with some "there are differences" images to see whether the threshold is OK. — Micka, Aug 27 '23 at 06:38
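The calibration idea in this comment could be sketched as follows. This is a minimal illustration, not a tested recipe: `calibrate_threshold`, `count_changed_pixels`, and the `margin` value are hypothetical, and the random arrays stand in for SSIM maps computed from real "no change" and "changed" frame pairs.

```python
import numpy as np

def calibrate_threshold(no_change_maps, margin=0.05):
    """Pick a cutoff slightly below the lowest SSIM value seen in 'no change' pairs."""
    lowest = min(float(np.min(m)) for m in no_change_maps)
    return lowest - margin

def count_changed_pixels(ssim_map, threshold):
    """Count pixels whose local SSIM falls below the cutoff."""
    return int(np.count_nonzero(np.asarray(ssim_map) < threshold))

# Synthetic stand-ins for SSIM maps of frame pairs with lighting changes only (assumption)
rng = np.random.default_rng(0)
no_change_maps = [rng.uniform(0.7, 1.0, size=(32, 32)) for _ in range(5)]
threshold = calibrate_threshold(no_change_maps)

# A map with a real change: a 10x10 block of low similarity
changed_map = no_change_maps[0].copy()
changed_map[5:15, 5:15] = 0.1

false_alarms = sum(count_changed_pixels(m, threshold) for m in no_change_maps)
detected = count_changed_pixels(changed_map, threshold)
print(threshold, false_alarms, detected)
```

Taking the minimum is sensitive to a single outlier pixel; a low percentile of the calibration values may be a more forgiving choice.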
I don't know what the values are before the uint8 conversion. But regarding the threshold comment, how can I change the threshold? It's not a value we can set, it's Otsu. — Tina J, Aug 27 '23 at 12:27
If you want a fixed threshold you can just use thresh = diff > value. At first, if I were you, I would just cv2.imshow diff to see how the values look after the uint8 conversion. If there are many white pixels, then the conversion with factor 255 is a problem. — Micka, Aug 27 '23 at 13:29
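A fixed threshold in place of Otsu might look like the sketch below. Low SSIM values indicate dissimilarity (which is why the question's code uses THRESH_BINARY_INV), so this version marks pixels below the cutoff rather than above it. `fixed_threshold_mask` is a hypothetical helper and `value=0.5` is an arbitrary placeholder, not a recommended setting.

```python
import numpy as np

def fixed_threshold_mask(ssim_map, value=0.5):
    """Return a uint8 mask: 255 where local SSIM is below `value`, else 0."""
    ssim_map = np.asarray(ssim_map, dtype=np.float64)
    return np.where(ssim_map < value, 255, 0).astype(np.uint8)

# Synthetic stand-in for the float `diff` map from the question (assumption)
demo_map = np.array([[0.95, 0.40], [0.60, -0.20]])
mask = fixed_threshold_mask(demo_map, value=0.5)
print(mask)
```

The resulting single-channel uint8 array can be passed to `cv2.findContours` where the question's code passes `thresh`.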
SSIM is not robust against lighting changes; I'm not sure what made you think that. It is intended to quantify the perceptual degradation due to compression and other artifacts. I don't know what your ultimate goal is, but assuming you want to detect stuff happening in the camera view, I would suggest you look into background modeling with a Gaussian mixture model: http://www.ai.mit.edu/projects/vsam/Publications/stauffer_cvpr98_track.pdf — Cris Luengo, Aug 27 '23 at 17:41
I see. My final goal is to detect changes between two images. Do you happen to know any good approaches for that? I want it to be more robust to shadows and lighting changes. — Tina J, Aug 27 '23 at 18:04
