Match anime cel to corresponding frame (using template/feature matching?)

Question

I have an image of a cel (a painted plastic sheet layer scanned to create a frame for older animated movies) from Kiki's Delivery Service, as below:

and I want to be able to match this to its exact (or close enough) matching frame in the movie Kiki's Delivery Service:

To be clear, I have a folder containing ~12,000 frames from the movie covering the full runtime, so the task is to iterate over these to algorithmically find the frame that is 'most similar' to the cel.

What I want

To exactly match a cel to its corresponding movie frame
What this probably means: a score between 0 and 1 inclusive for each of the 12,000 frames, where 1 means a perfect match to the cel and 0 means no match at all, with the score highest for the actual corresponding frame. The algorithm could also terminate early once a score > 0.99 (say) is found
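A minimal sketch of that search loop, to make the requirement concrete. The name `find_best_frame` and the `score_fn` callback are hypothetical placeholders: `score_fn` stands for whatever similarity measure ends up being used, and is assumed to return a value in [0, 1].

```python
from typing import Callable, Iterable, Optional, Tuple


def find_best_frame(
    frame_paths: Iterable[str],
    score_fn: Callable[[str], float],
    stop_at: float = 0.99,
) -> Tuple[Optional[str], float]:
    """Return (path, score) of the best-scoring frame, stopping early above stop_at."""
    best_path, best_score = None, -1.0
    for path in frame_paths:
        # Hypothetical callback: similarity of this frame to the cel, in [0, 1].
        score = score_fn(path)
        if score > best_score:
            best_path, best_score = path, score
        if score > stop_at:
            break  # good enough: skip the remaining frames
    return best_path, best_score
```

With early termination the result is the first frame above the threshold, not necessarily the global maximum, so the threshold would need to be high enough that near-duplicate frames are acceptable.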

Question

What is the optimal approach for doing this?

Observations

Template matching seems to be sub-optimal, because it seems to assume finding a small sub-image in a much larger image
I'm aware that 'image pyramids' can be used to build up a set of smaller and smaller images, but this seems expensive in terms of time and computation
Feature matching might be a better approach, as it seems to work independently of scale and rotation

What I have tried

Removing the white background in the cel. The background is irrelevant, and I don't want it to influence any algorithm (e.g. by favouring frames that contain a lot of white)

import cv2 as cv

# Load the cel in colour and make a greyscale copy for thresholding.
cel = cv.imread('./cels/kikis_delivery_service/1.jpg', cv.IMREAD_COLOR)
cel_grey = cv.cvtColor(cel, cv.COLOR_BGR2GRAY)

# Pixels brighter than 220 count as background: they get alpha 0, the rest 255.
ret, mask = cv.threshold(cel_grey, 220, 255, cv.THRESH_BINARY_INV)
b, g, r = cv.split(cel)
# Use the mask as the alpha channel; PNG keeps the transparency.
dst = cv.merge([b, g, r, mask])
cv.imwrite('test.png', dst)

Template matching, but the results were not good (scores < 0.3 using cv.TM_CCOEFF_NORMED)

Any help and suggestions on this would be much appreciated.

Classical vision methods are not the best for this kind of application, I'd suggest to research a Deep Learning approach, perhaps Siamese Networks, which are capable of measuring similarity between two inputs. — stateMachine, Apr 02 '23 at 22:37
Thank you @stateMachine, this is something I will explore. My slight reluctance to use deep learning is that I know the drawing on the cel should exactly match some subsection of a frame, i.e. not just be similar but be practically identical. Template/feature matching capture this structure by design (they assume the template is exactly there, modulo some transformations), whereas deep learning, unless I'm mistaken, isn't aware of this structure and will just try and make an educated guess? — tonkotsu, Apr 02 '23 at 22:49

Neotrash · Answer 1 · 2023-04-02T23:06:05.853

Not sure if this is what you're looking for, but the mean squared error (MSE) between two images gives a measure of how "close" they are. You could then sort the frames by MSE to find the one closest to the cel, although it could take some time.

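import os
import cv2
import numpy as np

def mse(img1, img2):
    # Both inputs must be single-channel images of the same shape.
    # Cast to float first: cv2.subtract saturates at 0 for uint8 inputs,
    # and squaring uint8 values overflows.
    diff = img1.astype(np.float64) - img2.astype(np.float64)
    return np.mean(diff ** 2)

folder = "path/to/folder"
images = [os.path.join(folder, file) for file in sorted(os.listdir(folder))]

img1 = cv2.imread("image_of_cel.png")
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)

image_errors = []

for image in images:
    img2 = cv2.imread(image)
    img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
    # MSE needs equal shapes, so resize the frame to the cel's (width, height).
    img2 = cv2.resize(img2, (img1.shape[1], img1.shape[0]))

    image_errors.append((image, mse(img1, img2)))

# Lowest error first: the best candidate is image_errors[0].
image_errors.sort(key=lambda pair: pair[1])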

This code is not mine; it is taken from here. Note that MSE is not bounded to the range 0 to 1, and a lower value means a closer match.
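If a score between 0 and 1 is needed, one possible approach is to rescale the MSE by its largest possible value for 8-bit images and flip it so that 1 means identical. The sketch below also accepts an optional mask (for example the thresholded mask from the question) so that the white background is ignored. The function name `masked_similarity` is made up for illustration, and it assumes the cel and the frame have already been resized and aligned to the same shape, which may not hold for a real cel scan.

```python
import numpy as np


def masked_similarity(cel_grey, frame_grey, mask=None):
    """Map the MSE over the masked pixels to a score in [0, 1] (1 = identical)."""
    a = cel_grey.astype(np.float64)
    b = frame_grey.astype(np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape; resize one first")
    # Compare every pixel by default, otherwise only where the mask is non-zero.
    keep = np.ones(a.shape, dtype=bool) if mask is None else mask > 0
    if not keep.any():
        return 0.0  # nothing to compare
    mse = np.mean((a[keep] - b[keep]) ** 2)
    # For 8-bit images the largest possible MSE is 255**2.
    return 1.0 - mse / 255.0 ** 2
```

Sorting the frames by this score in descending order gives the same ranking as sorting by masked MSE in ascending order; only the scale changes.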
