
I'm trying to do video stabilization in Python using template matching via skimage. The code is supposed to track a single point through the whole video, but the tracking is awfully imprecise and I suspect it isn't even working correctly.

This is the track_point function, which is supposed to take a video and the coordinates of a point as input and return an array with the tracked position of that point for each frame:

import numpy as np
from skimage.feature import match_template
from skimage.color import rgb2gray

def track_point(video, x, y, patch_size = 4, search_size = 40):

    length, height, width, _ = video.shape

    frame = rgb2gray(np.squeeze(video[1, :, :, :])) # convert image to grayscale
    x1 = int(max(1, x - patch_size / 2))
    y1 = int(max(1, y - patch_size / 2))
    x2 = int(min(width, x + patch_size / 2 - 1))
    y2 = int(min(height, y + patch_size / 2 - 1))
    template = frame[y1:y2, x1:x2] # cut the reference patch (template) from the first frame
    track_x = [x]
    track_y = [y]
    #plt.imshow(template)
    half = int(search_size/2)
    for i in range(1, length):
        prev_x = int(track_x[i-1])
        prev_y = int(track_y[i-1])
        frame = rgb2gray(np.squeeze(video[i, :, :, :])) # Extract current frame and convert it grayscale
        image = frame[prev_x-half:prev_x+half,prev_y-half:prev_y+half] # Cut-out a region of search_size x search_size from 'frame' with the center in the point's previous position (i-1)
        result = match_template(image, template, pad_input=False, mode='constant', constant_values=0) # Compare the region to template using match_template
        ij = np.unravel_index(np.argmax(result), result.shape) # Select best match (maximum) and determine its position. Update x and y and append new x,y values to track_x,track_y

        x, y = ij[::-1] 
        x += x1
        y += y1
        track_x.append(x)
        track_y.append(y)

    return track_x, track_y
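
To make sure I'm reading match_template's output correctly, I also put together a small synthetic check (this is just my assumption about the coordinate convention, not part of my actual pipeline; the array and the patch position are made up). With pad_input=False I expect the argmax of the result to be the top-left corner of the best match inside the searched image:

import numpy as np
from skimage.feature import match_template

rng = np.random.default_rng(0)
img = rng.random((40, 40))           # random image so the best match is unambiguous
template = img[22:26, 10:14].copy()  # 4x4 patch whose top-left corner is at (row=22, col=10)

result = match_template(img, template, pad_input=False)
row, col = np.unravel_index(np.argmax(result), result.shape)
print(row, col)  # I expect (22, 10), i.e. the top-left corner of the match, not its center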

And this is how I call the function:

import matplotlib.pyplot as plt

points = track_point(video, point[0], point[1])
# Draw trajectory on top of the first frame from video
image = np.squeeze(video[1, :, :, :]) 
figure = plt.figure()
plt.gca().imshow(image) 
plt.gca().plot(points[0], points[1]) 
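
For completeness, video and point come from roughly this (the exact reader shouldn't matter, and the file name and coordinates below are just placeholders, not my real values):

import numpy as np
import imageio

video = np.array(imageio.mimread("video.mp4", memtest=False))  # shape (frames, height, width, 3)
point = (300, 200)  # placeholder (x, y) of the pixel I want to track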

I expect the plotted trajectory to be fairly regular, since the video isn't that shaky, but it's not.

[screenshot of the resulting trajectory plot]

For some reason the graph ends up plotting almost every coordinate of the search window.

EDIT: Here's the link for the video: https://upload-video.net/a11073n9Y11-noau

What am I doing wrong?

